How can I ensure my UIM installation is up and running after a reboot?

Article ID: 220745

Products

DX Infrastructure Management

Issue/Introduction

After a reboot, we need to validate the servers and confirm that all of our services and applications are running as expected.

Can you please provide the steps or checks we should run (ideally by way of automation or a script) to confirm that all is well?

Environment

Release : 20.3

Component : UIM - ROBOT

Resolution

The best we can do is point you to some previous discussions and a few suggestions. Technically this is a customization request (scripting/automation), so it is outside the scope of Support. For customization or automation questions we normally refer customers to the Communities or to an engagement with Broadcom Services.

With that in mind, here is a link to a Communities discussion that has some ideas:

https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=767441#bm9183840d-dfa3-4643-86fe-5536cc298764

The following KB article contains an unsupported script that can be run in the script editor of the NAS probe to help check probe status:

https://knowledge.broadcom.com/external/article?articleId=34985

Here is another script, which uses a NAS Auto Operator to automatically check for the "probe failed to start, file check determines changes in the probe" error and validate security:

https://knowledge.broadcom.com/external/article/34372/automatically-validate-hdb-and-spooler-p.html

 

There are also some older discussions available, such as:

https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=756808#bm3103b328-8c70-4ca6-8191-b830399c9bef

As mentioned, beyond what is linked above we do not have much to offer for automating these health checks.

Additional Information

Here are some additional log messages and behaviors that can be monitored and may be of interest:

robot:

In the robot log you should always see the robot successfully establish contact with the hub:

Jul 30 14:50:21:912 [140491579033408] 0 Controller: Hub CoreA(10.173.36.161) contact established

 

When the robot is fully started, it must be in the LISTENING state for TCP connections on the following ports (this can be checked with netstat, telnet, or a similar utility):

48000
48001


hub:

When the hub finishes starting up, the following is always logged:

Jul 30 14:52:21:072 [8152] 0 hub: hubi main thread started 


When the hub is fully started, it must be in the LISTENING state for TCP connections on ports:

48000
48001
48002
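
The port checks described above for the robot and hub could be scripted along the following lines. This is only a minimal sketch, not a supported tool: it attempts a TCP connection to each port, which is roughly what a manual telnet test does. The port numbers come from this article; the host address `127.0.0.1`, the timeout, and the function names are assumptions chosen for illustration.

```python
import socket

# Port numbers are taken from this article.
ROBOT_PORTS = (48000, 48001)
HUB_PORTS = (48000, 48001, 48002)


def port_is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_ports(host: str, ports) -> dict:
    """Map each port to True (accepting connections) or False."""
    return {port: port_is_listening(host, port) for port in ports}


if __name__ == "__main__":
    # Assumption: the script runs on the hub server itself.
    for port, ok in check_ports("127.0.0.1", HUB_PORTS).items():
        print(f"port {port}: {'LISTENING' if ok else 'NOT LISTENING'}")
```

On a server that runs only a robot, `ROBOT_PORTS` would be passed instead of `HUB_PORTS`.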

data_engine:

When data_engine has fully started, you will see the following:

Jul 30 14:42:33:229 [12256] 0 de: data_engine starting main processing loop. vbRun=1 vbShutdown=0 

 

wasp:

The wasp probe always logs the following when it is fully started:

Jul 30 14:44:08:005 INFO  [main, com.nimsoft.nimbus.NimProbe] ****************[ Starting ]**************** 

Additionally, you can check wasp.cfg for the http_port and/or https_port settings (usually 80, 8080, 443, or 8443). These ports should be listening for TCP connections and, like the hub and robot ports, can be checked for availability.
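
As a rough illustration of the wasp.cfg lookup, the sketch below pulls the http_port and https_port values out of the configuration text so they can be fed into a port check. It assumes the settings appear as `key = value` lines, one per line; the function name and the sample text in the usage note are hypothetical, and the location of wasp.cfg on your system is not assumed here.

```python
import re

# Matches lines such as "   http_port = 80" or "https_port = 8443".
PORT_LINE = re.compile(
    r"^[ \t]*(https?_port)[ \t]*=[ \t]*(\d+)\s*$", re.MULTILINE
)


def wasp_ports(cfg_text: str) -> dict:
    """Return the http_port/https_port settings found in the cfg text."""
    return {key: int(value) for key, value in PORT_LINE.findall(cfg_text)}
```

For example, calling `wasp_ports` on text that contains the lines `http_port = 80` and `https_port = 8443` returns both settings as integers, keyed by setting name. If a key is missing from the file, it is simply absent from the result.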


CABI:

When the cabi probe is operational, it always logs the following:

Jul 30 14:53:01:595 [UserSynchronizationThread, cabi] Finished synchronizing users between UIM and CABI 



For any other probe of concern, you can follow a process like this:

- set the loglevel on the probe to "0"
- deactivate and then activate the probe
- check which messages are logged consistently on probe startup.

Any log message that appears at loglevel 0 also appears at every other loglevel.
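
The log checks in this section could be automated with something like the sketch below. It is a minimal example under stated assumptions, not a supported script: the marker strings are substrings of the log messages quoted in this article, while the dictionary and function names are hypothetical, and the log file locations are left to the caller.

```python
from typing import Iterable, Optional

# Substrings of the startup messages quoted in this article.
STARTUP_MARKERS = {
    "robot": "contact established",
    "hub": "hubi main thread started",
    "data_engine": "data_engine starting main processing loop",
    "wasp": "[ Starting ]",
    "cabi": "Finished synchronizing users between UIM and CABI",
}


def last_marker_line(lines: Iterable[str], marker: str) -> Optional[str]:
    """Return the most recent log line containing marker, or None."""
    found = None
    for line in lines:
        if marker in line:
            found = line.rstrip()
    return found


def check_log(path: str, probe: str) -> Optional[str]:
    """Scan the log file at path for the startup marker of the given probe."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return last_marker_line(handle, STARTUP_MARKERS[probe])
```

Because a log file can still hold entries from an earlier start, the timestamp on the returned line should be compared with the reboot time before the probe is treated as healthy. Markers for other probes, found with the loglevel 0 process above, can be added to the dictionary.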