Healthcheck shows NCM Device Servers as "offline" even when Device Server tasks such as autodiscovery, pull jobs, test credentials and others are completing without issue
Article ID: 330948
Products
VMware Smart Assurance
Issue/Introduction
Healthcheck shows Device Servers as offline in the NCM System Administration console even when the Device Servers are otherwise functioning properly. Seeing a Device Server reported as offline can cause concern, even though its tasks are completing normally.
Environment
10.1.x - NCM
Cause
This issue can be caused by the healthcheck.pl script not allowing enough time between the creation and moving of the command files on the Smarts NCM Application Server for each Device Server.
Resolution
This issue can be addressed by updating the healthcheck.pl script to allow sufficient time between cycles. This can be done as follows:
1. Log in to the Smarts NCM Application Server with root/administrator privileges, as required by your operating system.
2. Run the following command to stop the healthcheck service on the Application Server:
   /etc/init.d/healthcheck stop
3. Update the healthcheck.pl script to allow more time between command file creation, as follows:
   a. Open the healthcheck.pl script file, found here:
      $VOYENCE_HOME/Healthcheck/healthcheck.pl
   b. Look for 'sleep 1'. This is found in two places, lines 227 and 265.
   c. Change it to 'sleep 10' (or another desired value) in both places.
   d. Save the file.
4. Run the following command to start the healthcheck service on the Application Server:
   /etc/init.d/healthcheck start
5. Let the healthcheck run a few times, then confirm that the status of the Device Servers is displayed correctly.
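The edit in step 3 could also be scripted. The following is a minimal sketch, assuming a Linux Application Server with GNU sed and assuming the delay appears literally as 'sleep 1' in the script. The helper names show_sleep_lines and bump_sleep are hypothetical and are not part of NCM. Run it only while the healthcheck service is stopped.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not shipped with NCM.

# List every line that mentions 'sleep', with line numbers, so the two
# occurrences can be checked before and after the change.
show_sleep_lines() {
    grep -n 'sleep' "$1"
}

# Replace 'sleep 1' with 'sleep <new_delay>' in the given script.
bump_sleep() {
    script="$1"            # path to healthcheck.pl
    new_delay="${2:-10}"   # replacement delay in seconds (default 10)
    # Keep a copy of the original so the change can be reverted.
    cp -p "$script" "$script.bak" || return 1
    # \b (GNU sed) stops values such as 'sleep 10' or 'sleep 15' from matching.
    sed -i -E "s/\bsleep 1\b/sleep ${new_delay}/" "$script"
}

# Example, run between the stop and start commands:
#   show_sleep_lines "$VOYENCE_HOME/Healthcheck/healthcheck.pl"
#   bump_sleep "$VOYENCE_HOME/Healthcheck/healthcheck.pl" 10
#   show_sleep_lines "$VOYENCE_HOME/Healthcheck/healthcheck.pl"
```

The backup copy (healthcheck.pl.bak) makes it easy to restore the original file if the new delay needs to be reverted. Line numbers in your copy of the script may differ from 227 and 265, so listing the matches first is a useful check.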