cdm reboot alarm not getting triggered

Article ID: 191835

Products

NIMSOFT PROBES

DX Infrastructure Management

Issue/Introduction

In most cases, the cdm probe does not raise an alarm when a server reboots. This is causing major escalations.

cdm probe version - 6.30
UIM version - 8.5.1

Cause

- Possible corruption of the cdm.data file.

- The issue occurred with cdm 6.30, and a fix was developed. It should not occur in later cdm versions.

Environment

- Release: 8.5.1

- cdm 6.30

Resolution

  1. Which robot version(s) are installed on the systems where the cdm probe shows this behavior?
  2. Does this happen only on Windows, or also on UNIX/Linux?
  3. On how many servers are the boot alarms not being generated?

The alarm may have been generated and then cleared. Right-click within the alarm sub-console window and review the alarm history to see whether the expected reboot alarms were generated but later cleared. Note that the reboot alarm is Informational, and there is an out-of-the-box nas rule that automatically clears Informational alarms after 3 days. The nas profile rule is named "Automatic cleanup of low-severity messages". Another nas rule may also be clearing the reboot alarm or making it invisible.

In the cdm.cfg you can see the BootAlarm message configuration:

   <BootAlarm>
      text = $check_name: Computer has been rebooted at $boot_time
      level = information
      subsystem = 1.1.1
      token = boot_alarm
      i18n_token = as#system.cdm.comp_reboot
   </BootAlarm>

 

You can also run the following command on a Windows system where you suspect no reboot alarm is being generated:

systeminfo | find /i "Boot Time"

Then search for a reboot alarm around that time in nas_alarms or nas_transaction_summary.

You can also run the following queries to examine the reboot alarms that did occur. When working with Support, attach the query output in .csv format.

   select * from nas_alarms where message like '%rebooted%'

   select * from nas_transaction_summary where message like '%rebooted%'

 

Here is how the cdm probe obtains the uptime and decides whether to send the boot alarm. The uptime comes from the following sources:

- On Unix: /proc/uptime

- On Windows: the 'System' performance monitor (perfmon) counter. The probe reads the uptime from it and calculates the boot time from there.

   Perfmon Counter->System-> "System Up Time"

System Up Time is the elapsed time (in seconds) that the computer has been running since it was last started. This counter displays the difference between the start time and the current time.
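As an illustration of how a boot time can be derived from an uptime value, here is a minimal Python sketch. It is not the probe's actual code; the function names `boot_time_from_uptime` and `read_proc_uptime` are hypothetical, and the sketch assumes the boot time is simply the current time minus the uptime.

```python
import time


def boot_time_from_uptime(uptime_seconds, now=None):
    """Return the boot time as a Unix epoch (whole seconds).

    uptime_seconds: seconds the system has been running.
    now: current time as a Unix epoch; defaults to the system clock.
    """
    if now is None:
        now = time.time()
    return int(now - uptime_seconds)


def read_proc_uptime(path="/proc/uptime"):
    """Read the uptime in seconds from a Linux-style /proc/uptime file.

    The first field of the file is the uptime in seconds.
    """
    with open(path) as handle:
        return float(handle.read().split()[0])
```

On a Linux system, `boot_time_from_uptime(read_proc_uptime())` gives an approximate boot time in epoch seconds.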


The cdm folder (…\Program Files (x86)\Nimsoft\probes\system\cdm) contains a file named 'cdm.data', which stores the last recorded boot time. If the boot time calculated from the current uptime differs from the value in the file, an alarm is sent.

The file has the following structure:

<server>
   boot_time = 1550241585
</server>

You can open the file with Notepad.
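The comparison described above can be sketched in Python. This is an illustration only, not the probe's implementation: the function names are hypothetical, the `tolerance_seconds` parameter is an assumption added for experimentation (the default of 0 matches the "any difference sends an alarm" description), and returning no alarm when the file cannot be parsed is likewise an assumption.

```python
import re


def parse_boot_time(cdm_data_text):
    """Extract the boot_time value (epoch seconds) from cdm.data content.

    Returns None if no boot_time line is found, for example when the
    file is empty or corrupted.
    """
    match = re.search(r"^\s*boot_time\s*=\s*(\d+)\s*$", cdm_data_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def should_send_boot_alarm(stored_boot_time, current_boot_time, tolerance_seconds=0):
    """Return True when the current boot time differs from the stored one.

    Assumption: with no stored value there is nothing to compare
    against, so no alarm is reported by this sketch.
    """
    if stored_boot_time is None:
        return False
    return abs(current_boot_time - stored_boot_time) > tolerance_seconds
```

For example, passing the content shown above to `parse_boot_time` yields 1550241585, and comparing that with a newer boot time such as 1590783698 makes `should_send_boot_alarm` return True.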

 

Also check that the boot alarm is enabled in your cdm configuration.

 

To review the system boot time, you can run this command on a Windows system:

C:\Users\Administrator>wmic os get lastbootuptime
LastBootUpTime
20200527124629.487263-420

On my own system, I compared the last boot time with the value in the cdm.data file, and they were different. The cdm.data value is in epoch time, so it must be converted before it can be compared with the value returned by the wmic command. You can convert it at: https://www.epochconverter.com/

1590783698, which is Friday, May 29, 2020, 20:21:38 UTC (4:21:38 PM in UTC-4)

Because the values are different, a boot alarm should have been generated. I checked and did find one.
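If you prefer not to use a website, the same comparison can be done with a short Python sketch. The function name `cim_datetime_to_epoch` is hypothetical. It assumes the wmic value uses the CIM datetime layout yyyymmddHHMMSS.ffffff followed by a signed UTC offset in minutes, as in the sample output above.

```python
from datetime import datetime, timedelta, timezone


def cim_datetime_to_epoch(value):
    """Convert a wmic LastBootUpTime string to Unix epoch seconds.

    Assumed layout: yyyymmddHHMMSS.ffffff followed by a signed
    UTC offset in minutes, e.g. 20200527124629.487263-420.
    """
    local = datetime.strptime(value[:21], "%Y%m%d%H%M%S.%f")
    offset_minutes = int(value[21:])
    tz = timezone(timedelta(minutes=offset_minutes))
    return int(local.replace(tzinfo=tz).timestamp())


def epoch_to_utc_string(epoch_seconds):
    """Render epoch seconds as an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()
```

Run `cim_datetime_to_epoch` on the wmic output and compare the result with the boot_time value in cdm.data; run `epoch_to_utc_string` on the cdm.data value to read it as a date.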
