I noticed that in the CM_NIMBUS_ROBOT table we have "check_time" and "alive_time" fields; however, it appears these fields are not being kept current. For example, I see numerous robots that are online and have been online for some time, but check_time and alive_time show old dates.
What do these fields represent, and what updates them?
Is it possible to increase the frequency of these checks?
Applies to all versions of discovery_server from UIM 8.x through 23.x.
Working as expected.
Although the discovery_server regularly checks the robots for updates (for example, newly added metrics), the check_time and alive_time fields are not updated on every check interval as might be expected.
In UIM 20.4 and prior, the niscache scan process checks the robots on a regular interval, but this check does not update these fields.
In UIM 23.x and forward, the hubs "push" new niscache updates to discovery_server, but this also does not update these fields.
Instead, in all versions of UIM, these fields are updated only when discovery_server checks a robot in response to a status change event for that robot. A status change means that an offline robot came online or an online robot went offline; a robot restart counts as a status change.
When a robot starts/stops/restarts, its hub reports the change in status of that robot to discovery_server. In turn, discovery_server attempts to contact the robot in order to verify the status.
If the robot was started or restarted, this check will generally succeed, and check_time and alive_time will both be updated to the current timestamp.
If the robot was stopped or went offline, this check will fail; in that case check_time is updated, but alive_time is not.
Additionally, when discovery_server is restarted, it will re-check all the robots and update check_time and alive_time with a current timestamp.
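The update rules described above can be sketched as a small model. This is only an illustration of the described behavior, not discovery_server's actual code: the RobotRow class, the record_status_check and recheck_all functions, and the reachability mapping are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class RobotRow:
    """Hypothetical stand-in for one CM_NIMBUS_ROBOT row (only the two fields discussed)."""
    name: str
    check_time: Optional[datetime] = None
    alive_time: Optional[datetime] = None


def record_status_check(row: RobotRow, reachable: bool, now: datetime) -> RobotRow:
    """Model the check that follows a status change event for a robot."""
    # check_time is updated whether or not the robot could be contacted.
    row.check_time = now
    # alive_time is updated only when the robot was successfully contacted.
    if reachable:
        row.alive_time = now
    return row


def recheck_all(rows: List[RobotRow], reachability: Dict[str, bool], now: datetime) -> None:
    """Model a discovery_server restart: every robot is re-checked."""
    for row in rows:
        # Assumption: a robot missing from the mapping is treated as unreachable.
        record_status_check(row, reachability.get(row.name, False), now)
```

In this model, a robot that started and then went offline ends up with a check_time newer than its alive_time, while a robot that has simply stayed online keeps both timestamps from its last status change.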
In summary, check_time and alive_time are updated in CM_NIMBUS_ROBOT only when a robot stops, starts, or restarts, or when discovery_server is restarted.
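One practical consequence is that old values in these fields do not by themselves indicate a problem. As a rough sketch of how the two timestamps could be read together, the hypothetical helper below (last_check_outcome is not part of UIM) assumes that a successful check writes the same timestamp to both fields, as described above; real rows may differ slightly, so treat the result as a hint only.

```python
from datetime import datetime
from typing import Optional


def last_check_outcome(check_time: Optional[datetime],
                       alive_time: Optional[datetime]) -> str:
    """Infer the result of the most recent status-triggered check for a robot."""
    if check_time is None:
        # No status-triggered check has been recorded for this robot.
        return "never checked"
    if alive_time is not None and alive_time >= check_time:
        # Both fields were written by the same successful check.
        return "reachable at last check"
    # check_time moved forward without alive_time: the last check failed.
    return "unreachable at last check"
```

For example, a robot whose check_time and alive_time are both several months old but equal was reachable when its status last changed, which is consistent with a robot that has been online without interruption since then.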