cdm reboot alarm not getting triggered


Article ID: 191835


Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

  • Reboot alarms from the cdm probe are generated inconsistently: in some cases the alarm is created as expected, and in other cases no alarm is raised.
  • Sometimes, after a server crash, the cdm.data file is empty when the server starts up.
  • When the server is rebooted normally, the file contains the time stamp.
  • In testing, no alarm was raised whenever the file was empty after a reboot.

Environment

  • Release: DX UIM 20.4.x or higher
  • Component : UIM - CDM WITH IOSTAT
  • cdm: any

Cause

  • Possible corruption of the cdm.data file (for example, the file being left empty) when the server crashes

Resolution

1. Make sure you are using CDM 6.84 or higher. If the issue still occurs after upgrading cdm, follow the steps below.

2. Check to see if the alarm is being cleared or excluded by a NAS AO profile or preprocessing rule.

Right-click within the alarm sub-console window and look back through the alarm history to see whether the expected reboot alarms were generated but later cleared. NOTE: the reboot alarm is Informational, and there is an out-of-the-box nas rule that automatically clears Informational alarms after 3 days.

Another nas rule may also be clearing the reboot Informational alarm or making it invisible. The nas profile rule is named: "Automatic cleanup of low-severity messages"

In the cdm.cfg you can see the BootAlarm message configuration:

<BootAlarm>
   text = $check_name: Computer has been rebooted at $boot_time
   level = information
   subsystem = 1.1.1
   token = boot_alarm
   i18n_token = as#system.cdm.comp_reboot
</BootAlarm>

IMPORTANT: Also check your nas.cfg for any nas AO profiles or preprocessing rules that may be closing or excluding the boot alarms!


3. Set the cdm probe to loglevel 5 with a logsize of 50000, then check the cdm.log to see whether the boot alarm is being generated.

Here is an example of the log entry you would see in the cdm.log:

cdm: doWork - boot time alarm sent, adjusting last boot time checked: 1550241585 -> 1743601102
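
The two numbers in that log entry are epoch timestamps (previous and new boot time). As a minimal sketch, assuming the log line is worded exactly like the sample above (other cdm versions may differ), a small Python helper can pull both values out and print them as readable UTC times. The function name parse_boot_alarm_line is hypothetical and is not part of UIM.

```python
import re
from datetime import datetime, timezone

# Pattern modelled on the sample cdm.log entry; this wording is an assumption
# and may differ between cdm versions.
BOOT_LINE = re.compile(
    r"boot time alarm sent, adjusting last boot time checked: (\d+) -> (\d+)"
)


def parse_boot_alarm_line(line):
    """Return (previous, current) boot times as UTC datetimes, or None if no match."""
    match = BOOT_LINE.search(line)
    if match is None:
        return None
    previous, current = (
        datetime.fromtimestamp(int(value), tz=timezone.utc)
        for value in match.groups()
    )
    return previous, current


sample = (
    "cdm: doWork - boot time alarm sent, "
    "adjusting last boot time checked: 1550241585 -> 1743601102"
)
previous, current = parse_boot_alarm_line(sample)
print("previous boot:", previous.isoformat())
print("current boot: ", current.isoformat())
```

Running the same function over every line of cdm.log gives a quick list of the reboots the probe actually alarmed on.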


4. You can also run the following command on a Windows system where you suspect no reboot alarm is being generated:

systeminfo | find /i "Boot Time"

Then search for a boot alarm around that time frame in nas_alarms or nas_transaction_summary.


5. You may also run the following queries to examine the reboot alarms that did occur. When working with Support, attach the query output in .csv format.

   select * from nas_alarms where message like '%rebooted%'

   select * from nas_transaction_summary where message like '%rebooted%'


Here is how the cdm probe determines the boot time and alerts on it.

The following sources are used:

- On Unix--> /proc/uptime

- On Windows--> We determine the uptime from the 'System' performance monitor (perfmon) counter and calculate the boot time from there.

   Perfmon Counter->System-> "System Up Time"

System Up Time is the elapsed time (in seconds) that the computer has been running since it was last started. This counter displays the difference between the start time and the current time.
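
The boot-time calculation described above amounts to subtracting the uptime from the current time. The following is a minimal sketch of that arithmetic in Python, not the probe's actual code; the function names are hypothetical, and read_proc_uptime only works on systems that expose /proc/uptime.

```python
import time


def boot_time_from_uptime(uptime_seconds, now=None):
    """Estimate the boot time (epoch seconds) as current time minus uptime."""
    if now is None:
        now = time.time()
    return int(now - uptime_seconds)


def read_proc_uptime(path="/proc/uptime"):
    """Return the first field of /proc/uptime: seconds since the system started."""
    with open(path) as handle:
        return float(handle.read().split()[0])


if __name__ == "__main__":
    try:
        print("estimated boot time:", boot_time_from_uptime(read_proc_uptime()))
    except OSError:
        # /proc/uptime is not available on this platform (for example, Windows).
        print("uptime source not available on this system")
```

Because the result is derived from two readings, it can drift by a second between runs, which is worth keeping in mind when comparing it with a stored value.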


The probe keeps a file in the cdm folder: …\Program Files (x86)\Nimsoft\probes\system\cdm

The file contains the last recorded boot time. If the boot time calculated from the current uptime differs from the value in the file, an alarm is sent.

The file is named 'cdm.data' and has the following structure:

<server>

   boot_time = 1550241585

</server>

You can open the file with Notepad.
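
To check many servers for the empty-file condition described in the Issue section, the file can also be inspected with a script. The sketch below assumes the file looks exactly like the structure shown above; the function names and the status strings are hypothetical, and the comparison only mirrors the behaviour described in this article rather than the probe's real implementation.

```python
import re


def read_saved_boot_time(text):
    """Return the boot_time value from cdm.data contents, or None if it is absent."""
    match = re.search(r"^\s*boot_time\s*=\s*(\d+)\s*$", text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def classify_cdm_data(text, current_boot_time):
    """Describe what the saved value implies for the reboot alarm."""
    saved = read_saved_boot_time(text)
    if saved is None:
        # Matches the failure case in this article: empty file, no alarm seen.
        return "no saved boot time (file empty or corrupt)"
    if saved != current_boot_time:
        return "boot time changed, alarm expected"
    return "boot time unchanged, no alarm expected"


example = "<server>\n\n   boot_time = 1550241585\n\n</server>\n"
print(classify_cdm_data(example, 1743601102))
print(classify_cdm_data("", 1743601102))
```

Pass the contents of cdm.data (for example, read with open(path).read()) as the first argument.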

 

6. Check that the boot_alarm parameter in your cdm.cfg is set to yes.

 

To review the system boot time, you can run this command on a Windows system:

C:\Users\Administrator>wmic os get lastbootuptime

LastBootUpTime

20200527124629.487263-420

On a test system, the last boot time reported by wmic was compared with the value in the cdm.data file, and the two were different. The value in cdm.data is in epoch time, so it must be converted before it can be compared with the value returned by the wmic command. You can convert it at: https://www.epochconverter.com/

1590783698, which is Friday, May 29, 2020 4:21:38

Because the values are different, a boot alarm is expected, and a check of the alarms confirmed that one had been generated.
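
The same comparison can be done without a web converter. The sketch below converts the wmic LastBootUpTime string and the cdm.data epoch value to a common form using only the Python standard library. It assumes the usual wmic layout of yyyymmddHHMMSS.ffffff followed by a signed UTC offset in minutes; the function names are hypothetical. Note that the script prints UTC, so the hour shown differs from a local-time reading of the same epoch value.

```python
from datetime import datetime, timedelta, timezone


def wmic_to_epoch(value):
    """Convert a wmic LastBootUpTime string to epoch seconds."""
    stamp, sign, offset = value[:21], value[21], value[22:]
    local = datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")
    minutes = int(offset) * (1 if sign == "+" else -1)
    aware = local.replace(tzinfo=timezone(timedelta(minutes=minutes)))
    return int(aware.timestamp())


def epoch_to_utc(epoch):
    """Format epoch seconds as a readable UTC timestamp."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


wmic_epoch = wmic_to_epoch("20200527124629.487263-420")
saved_epoch = 1590783698  # boot_time value read from cdm.data in the example

print("wmic boot time:", epoch_to_utc(wmic_epoch))
print("cdm.data value:", epoch_to_utc(saved_epoch))
print("values differ: ", wmic_epoch != saved_epoch)
```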

Additional Information

When you do not receive or see an alarm from cdm or other probes, always check that the alarm messages are making their way from the downstream secondary/remote hub up to the upstream/Primary hub via ATTACH/GET queues or nas Forwarding and Replication.