net_connect probe stopped sending ping alarms

Article ID: 193873

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

We applied a Network Connectivity configuration to a group of servers via Monitoring Configuration Service (MCS). UIM was monitoring ping correctly on these servers, but after we updated net_connect_mcs_templates to version 3.42, the net_connect probe stopped sending ping alarms.

We noticed that the new profiles created by MCS have incorrect alarm messages configured in the host properties.

We changed the alarm messages in one of the profiles to the correct values (MsgConnectOk and MsgConnectFail) and blocked ICMP on the monitored server. With that change, the probe generated the alarm correctly.

The problem is that we have hundreds of servers monitored by net_connect, and new profiles are created automatically when new servers are added to the MCS group.

Environment

Release : 9.2.0

Component : UIM NET CONNECT

Resolution

Use the Packet Loss QoS instead of the Ping QoS, because the Operator Console alarm policy has a limitation with the Ping alarm threshold.

To detect whether a server is down or unreachable, configure only the Packet Loss threshold.

The Operator Console alarm policy cannot set a threshold on the NULL value, i.e. “-987654321”.

If the server is down or unreachable, the ping response time is NULL, which the alarm policy does not handle. It is handled when a legacy profile, Admin Console, or the IM GUI is used, because in those cases the alarm is managed by the probe, and the probe handles the NULL value for the ping alarm.
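
The difference can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a policy-style check is a plain numeric comparison against the threshold. The function names and the 100 ms threshold are hypothetical, and this is not the actual UIM evaluation code.

```python
# NULL marker value quoted in this article.
NULL_SENTINEL = -987654321


def naive_threshold_breached(response_ms, threshold_ms):
    """Hypothetical policy-style check: plain numeric comparison only."""
    return response_ms > threshold_ms


def null_aware_breached(response_ms, threshold_ms):
    """Hypothetical probe-style check: a NULL sample counts as a failed ping."""
    if response_ms == NULL_SENTINEL:
        return True
    return response_ms > threshold_ms


# An unreachable server reports the NULL marker instead of a response time.
sample = NULL_SENTINEL
print(naive_threshold_breached(sample, 100))  # the large negative value never exceeds the threshold
print(null_aware_breached(sample, 100))       # the NULL marker is treated as a failure
```

Under that assumption, the NULL sample never crosses a response-time threshold, which matches the missing ping alarms described above.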

Set the Packet Loss threshold condition to a value below 10-15% to detect that a machine is down. In the observed case, packet loss was between 20% and 50% when the server did not respond to ping successfully.
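
As a rough check of that threshold choice, the following Python sketch compares the packet loss values mentioned above against a 10% threshold. The function name, the default threshold, and the "greater than" comparison are assumptions for illustration, not UIM settings or API calls.

```python
def packet_loss_alarm(loss_percent, threshold_percent=10.0):
    """Hypothetical check: raise an alarm when packet loss exceeds the threshold."""
    return loss_percent > threshold_percent


# Values in the 20-50% range reported for a server that fails ping.
failing_samples = [20.0, 35.0, 50.0]
alarms = [packet_loss_alarm(value) for value in failing_samples]
print(alarms)

# A healthy server with no packet loss stays below the threshold.
print(packet_loss_alarm(0.0))

# A threshold set too high (for example 60%) would miss the same samples.
missed = [packet_loss_alarm(value, threshold_percent=60.0) for value in failing_samples]
print(missed)
```

A threshold above the observed 20-50% range would not fire for these samples, which is why a value below 10-15% is suggested.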

When using the MCS enhanced template, do not configure net_connect by any other means, such as Admin Console or the IM GUI. Also check that the alarm policy has been applied correctly on the net_connect robot host.