
Monitor a server's (robot) power off or unreachability and send a Nimbus email alert


Article ID: 241721


Updated On:


DX Unified Infrastructure Management (Nimsoft / UIM), CA Unified Infrastructure Management SaaS (Nimsoft / UIM), Unified Infrastructure Management for Mainframe


Our customer has asked us to monitor whether a server (robot) is powered off or unreachable, and to automatically send a Nimbus email alert to the responsible team. What is the best practice for creating this monitor through Infrastructure Manager? The customer also lists several hundred servers. Would this monitoring add noticeable overhead to Nimbus, or to network or database traffic?


Release : 20.3 or higher

Component : UIM - ROBOT


- Guidance


Normally, Availability measures system 'uptime' and Reachability measures device connectivity.

Availability is the percentage of time that the device is powered on and also capable of processing data. A device that is 'Available' might still be unreachable because of a network or communications failure by another device.

Reachability refers to whether a device is reachable from the source. Typically, data sources use ICMP (ping testing) to communicate regularly with the target device. Any communication failure, including the loss of the network path or routing, affects the reachability statistics. If ICMP is blocked and you cannot use the net_connect probe to ping the device, you can use the snmpcollector probe to determine reachability.

Reachability data comes from regular ping testing of all devices that support ICMP. A reachability value can be the percentage of ping responses that are received from the device during each reporting interval. You can use net_connect or icmp for ping monitoring.
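As an illustration of the percentage described above, here is a minimal Python sketch. The function name and the sample counts are hypothetical and are not part of net_connect, icmp, or snmpcollector; the probes may calculate their own values differently.

```python
def reachability_percent(responses_received: int, pings_sent: int) -> float:
    """Percentage of ping responses received during one reporting interval.

    Hypothetical helper for illustration only; not a UIM probe API.
    """
    if pings_sent <= 0:
        raise ValueError("pings_sent must be positive")
    if not 0 <= responses_received <= pings_sent:
        raise ValueError("responses_received must be between 0 and pings_sent")
    return 100.0 * responses_received / pings_sent


# Example: 60 pings sent in the interval, 57 answered.
print(reachability_percent(57, 60))  # 95.0
```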

UIM Availability (via QOS_POWER_STATE)

Note that QOS_POWER_STATE data is collected for hubs/robots by default and should not be changed/disabled.

QOS_POWER_STATE is a QoS metric sent by the robot. It helps you generate reports and is used for the availability calculation.
It is collected every 5 minutes, and each value is either 0 or 1.

By default, 0 means down and 1 means up.
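To show how a series of 0/1 samples can translate into an availability percentage, here is a minimal Python sketch. It assumes availability is simply the share of samples equal to 1. The function name and the sample data are hypothetical, and the OOTB reports may handle missing samples or maintenance windows differently.

```python
from typing import Sequence


def availability_percent(samples: Sequence[int]) -> float:
    """Share of power-state samples equal to 1 (up), as a percentage.

    Assumption for illustration: every sample has equal weight and
    there are no gaps in the series.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if any(s not in (0, 1) for s in samples):
        raise ValueError("samples must be 0 (down) or 1 (up)")
    return 100.0 * sum(samples) / len(samples)


# One hour at a 5-minute interval gives 12 samples.
# Here the robot reports down for three consecutive intervals.
one_hour = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(availability_percent(one_hour))  # 75.0
```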

How are Reachability and Availability in snmpcollector calculated?

Availability Reports (OOTB Reports but requires cabi_bundled)

Community Post: (unsupported, custom scripts/callbacks)

UIM robots_checker checks probes and runs callbacks on them. This probe was created for self-monitoring of UIM hubs and robots.
net_connect and icmp can send alarms based on the monitoring results. snmpcollector can also be configured to send alarms.

In a properly sized environment, this monitoring will not have any adverse effect on Nimbus overhead or on network or database traffic. For more information on UIM sizing requirements, please refer to:

UIM Sizing Requirements
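For a rough sense of scale, the following Python sketch counts how many QOS_POWER_STATE samples a given number of robots would produce per day at the 5-minute interval mentioned above. The robot count of 300 is a hypothetical figure standing in for "several hundred servers". The sketch counts samples only and says nothing about message size, ping traffic, or alarm volume, so it is not a substitute for the sizing guidance.

```python
def power_state_samples_per_day(robot_count: int, interval_minutes: int = 5) -> int:
    """Number of power-state samples produced per day by all robots.

    Hypothetical helper for a back-of-the-envelope estimate only.
    """
    if robot_count < 0:
        raise ValueError("robot_count must not be negative")
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    samples_per_robot = (24 * 60) // interval_minutes  # 288 at 5 minutes
    return robot_count * samples_per_robot


# Example: 300 robots at the default 5-minute interval.
print(power_state_samples_per_day(300))  # 86400
```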