cdm reboot alarm not getting triggered


Article ID: 191835


Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

We are having issues with reboot alarms from the cdm probe. In some cases the alarm is created as expected, and in other cases no alarm is created.
Sometimes, when the server crashes, the cdm.data file is empty when the server starts up again.
When the server is rebooted normally, the file contains the time stamp.

In our tests, whenever the file was empty after a reboot, no alarm was generated.
How can we fix this behavior?

 




Environment

Release: UIM 20.3/20.4

Component : UIM - CDM WITH IOSTAT

cdm: 6.70 / 6.80 / 6.80HF1

Robot : 9.33HF6 running on RHEL 8

Cause

- Possible corruption of the cdm.data file

 

Resolution

Make sure you are using cdm 6.80 or later. If the issue still occurs, follow the steps below:

The alarm may have been generated and then cleared. Right-click within the alarm sub-console window and review the alarm history to see whether the expected reboot alarms were generated. Note that the reboot alarm is Informational, and an out-of-the-box nas profile rule named "Automatic cleanup of low-severity messages" automatically clears Informational alarms after 3 days. Another nas rule may also be clearing the reboot alarm or making it invisible.

In the cdm.cfg you can see the BootAlarm message configuration:

<BootAlarm>
   text = $check_name: Computer has been rebooted at $boot_time
   level = information
   subsystem = 1.1.1
   token = boot_alarm
   i18n_token = as#system.cdm.comp_reboot
</BootAlarm>

 

Also, on a Windows system where you suspect that no reboot alarm is being generated, you can run the following command:

systeminfo | find /i "Boot Time"

Then search for a reboot alarm around that time in nas_alarms or nas_transaction_summary.

You can also run the following queries to examine the reboot alarms that did occur. When working with Support, attach the query output in .csv format.

   select * from nas_alarms where message like '%rebooted%'

   select * from nas_transaction_summary where message like '%rebooted%'

 

Here is how the cdm probe determines the boot time and alerts on it. The following sources are used:

- On Unix --> /proc/uptime

- On Windows --> The uptime is read from the 'System' performance monitor (perfmon) counter, and the boot time is calculated from it.

   Perfmon Counter -> System -> "System Up Time"

System Up Time is the elapsed time (in seconds) that the computer has been running since it was last started. This counter displays the difference between the start time and the current time.
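The calculation described above (boot time = current time minus uptime) can be sketched in Python. This is only an illustration of the arithmetic, not the probe's actual code; the function name `boot_time_from_uptime` is hypothetical. It assumes the standard Linux /proc/uptime format, where the first field is the uptime in seconds.

```python
import time


def boot_time_from_uptime(uptime_text, now=None):
    """Estimate the boot time (epoch seconds) from /proc/uptime content.

    Hypothetical helper for illustration; not part of the cdm probe.
    The first field of /proc/uptime is the uptime in seconds.
    """
    uptime_seconds = float(uptime_text.split()[0])
    if now is None:
        now = time.time()
    return int(now - uptime_seconds)


if __name__ == "__main__":
    try:
        with open("/proc/uptime") as handle:
            print(boot_time_from_uptime(handle.read()))
    except OSError:
        print("/proc/uptime is not available on this platform")
```

Because the result depends on when the calculation runs, two calls can differ by a second; treat the output as an estimate when comparing it with other values.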


There is a file in the cdm folder ->   …\Program Files (x86)\Nimsoft\probes\system\cdm

The file contains the boot time recorded by the probe. If the boot time calculated from the current uptime differs from the value in the file, an alarm is sent.

The file is named 'cdm.data' and has the following structure:

<server>
   boot_time = 1550241585
</server>

You can open the file with Notepad.
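As an alternative to reading the file by eye, the following Python sketch extracts boot_time from the contents of cdm.data and prints it as a readable UTC time. It returns None when no boot_time entry is present, which is the empty-file state described in the Issue section. The function names are hypothetical, and the parsing assumes the file structure shown above.

```python
import re
from datetime import datetime, timezone


def read_boot_time(text):
    """Return boot_time (epoch seconds) from cdm.data content, or None.

    Hypothetical helper; assumes a 'boot_time = <digits>' line.
    """
    match = re.search(r"^\s*boot_time\s*=\s*(\d+)\s*$", text, re.MULTILINE)
    return int(match.group(1)) if match else None


def format_epoch_utc(epoch):
    """Format epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


sample = "<server>\n   boot_time = 1550241585\n</server>\n"
stored = read_boot_time(sample)
print(stored, format_epoch_utc(stored))

# An empty file yields None, i.e. no stored boot time to compare against.
print(read_boot_time(""))
```

To check a real system, pass the contents of cdm.data from the cdm probe folder to `read_boot_time`.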

 

Also check that the boot alarm is enabled in your cdm configuration.

 

To review the system boot time, you can run this command on a Windows system:

C:\Users\Administrator>wmic os get lastbootuptime

LastBootUpTime

20200527124629.487263-420

On my own system, I checked the last boot time and compared it with the value in the cdm.data file, and the two differ. The value in cdm.data is in epoch (Unix) time, so it must be converted before comparing it with the value returned by the wmic command. You can convert it at: https://www.epochconverter.com/

1590783698, which is Friday, May 29, 2020 20:21:38 UTC (4:21:38 PM in UTC-4)

Because the values differ, a boot alarm was expected. I checked and did find one.
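The comparison can also be done without a web converter. The Python sketch below converts the wmic LastBootUpTime string to epoch seconds so that it can be compared directly with boot_time from cdm.data. It assumes the value is in the CIM datetime layout shown above (yyyymmddHHMMSS.ffffff followed by a signed UTC offset in minutes); the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cim_datetime_to_epoch(value):
    """Convert a CIM datetime string to epoch seconds.

    Hypothetical helper; expects 'yyyymmddHHMMSS.ffffff' followed by a
    signed UTC offset in minutes, e.g. '20200527124629.487263-420'.
    """
    stamp, offset_minutes = value[:21], int(value[21:])
    local = datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")
    aware = local.replace(tzinfo=timezone(timedelta(minutes=offset_minutes)))
    return int(aware.timestamp())


wmic_epoch = cim_datetime_to_epoch("20200527124629.487263-420")
cdm_data_epoch = 1590783698  # boot_time value from the example above

print(wmic_epoch)
print("values differ" if wmic_epoch != cdm_data_epoch else "values match")
```

The exact comparison here only mirrors the manual check in this article; whether the probe applies any tolerance when comparing values is not stated.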
