Alarms for nonexistent CPUs and file systems from RSP

Article ID: 220820

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

We are getting alarms such as "CPU XX data not found" or "No data for /XXXXX currently available" from the rsp probe for CPUs or file systems that are not listed in the profile.

I tried to rediscover the device in the profile, but the problem persists.

Environment

Release : 9.0.2

Component : UIM - RSP 5.50

Cause

The profiles monitored by the RSP probe are stored in rsp.cfg.
If the device does not have the file systems, and especially the CPUs, mentioned in the alarms, the most likely cause is that the original device was replaced by a different one with the same IP address.

The rsp logs show that the CPU is being monitored even though it has not been discovered:

Jul 27 11:14:51:860 [140053616019200] rsp: chk_cpu: failed to get latest entry from the database for cpuid 14 on XXXXXX.mon
Jul 27 11:14:51:864 [140053616019200] rsp: chk_cpus: XXXXXX.mon-cpu failed to check 14
Jul 27 11:14:51:864 [140053616019200] rsp: dbGetDiscoveryCpu: ndbExecute exec error(4): not found
Jul 27 11:14:51:864 [140053616019200] rsp: chk_cpus: cpu 14 has not been discovered on XXXXXX.mon - deactivaing profile
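
When many profiles are affected, it can help to pull the undiscovered CPU IDs out of the log automatically. The following is a minimal Python sketch, not part of the rsp probe. It assumes the "has not been discovered" message is worded exactly as in the excerpt above, and the function name undiscovered_cpus is hypothetical.

```python
import re
from collections import defaultdict

# Pattern copied from the log excerpt above; other rsp versions may word
# this message differently (assumption).
UNDISCOVERED_CPU = re.compile(
    r"chk_cpus: cpu (\d+) has not been discovered on (\S+)"
)


def undiscovered_cpus(log_lines):
    """Return {host: sorted CPU ids that the log reports as not discovered}."""
    found = defaultdict(set)
    for line in log_lines:
        match = UNDISCOVERED_CPU.search(line)
        if match:
            cpu_id, host = match.groups()
            found[host].add(int(cpu_id))
    return {host: sorted(ids) for host, ids in found.items()}
```

The function accepts any iterable of lines, so an open log file object can be passed to it directly.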

Reviewing rsp.cfg shows that the nonexistent CPU is still listed there despite the rediscovery:

   <XXXXXX.mon>
      active = yes
      credentials = passwords
      group = AIX_7
      host = XXXXXX.mon
      os = AIX_7
      type = ssh
      time_interval = 2 min
      <cpu>
         <0>
            name = 0
            cpuid = 0
            instance = yes
         </0>
         <1>
            name = 1
            cpuid = 1
            instance = yes
         </1>
...
         <14>
            name = 14
            cpuid = 14
            instance = yes
         </14>
         <15>
            name = 15
            cpuid = 15
            instance = yes
         </15>
         <16>
            name = 16
            cpuid = 16
            instance = yes
         </16>
         <17>
            name = 17
            cpuid = 17
            instance = yes
         </17>
         <18>
            name = 18
            cpuid = 18
            instance = yes
         </18>
         <19>
            name = 19
            cpuid = 19
            instance = yes
         </19>
      </cpu>
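
To confirm which configured CPUs are stale, the cpuid entries of a profile can be compared with the CPUs the device actually has. The sketch below is a rough illustration in Python, not an official parser for the probe configuration. It assumes the file follows the nesting shown in the excerpt (profile section, then cpu, then one subsection per CPU with a cpuid key). The function names and the list of CPUs present on the device are hypothetical inputs.

```python
import re

OPEN_TAG = re.compile(r"^<([^/>][^>]*)>$")   # e.g. <cpu> or <14>
CLOSE_TAG = re.compile(r"^</([^>]+)>$")      # e.g. </cpu> or </14>


def configured_cpu_ids(cfg_text, profile):
    """Collect cpuid values found under <profile> -> <cpu> -> <N>."""
    stack = []  # names of the sections currently open
    ids = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        closing = CLOSE_TAG.match(line)
        if closing:
            if stack and stack[-1] == closing.group(1):
                stack.pop()
            continue
        opening = OPEN_TAG.match(line)
        if opening:
            stack.append(opening.group(1))
            continue
        # A key = value line: keep it only inside <profile><cpu><N>.
        in_cpu_entry = (
            len(stack) >= 3 and stack[-3] == profile and stack[-2] == "cpu"
        )
        if in_cpu_entry and "=" in line:
            key, _, value = line.partition("=")
            if key.strip() == "cpuid":
                ids.append(int(value.strip()))
    return ids


def stale_cpu_ids(configured, present):
    """Return configured CPU ids that the device does not actually have."""
    return sorted(set(configured) - set(present))
```

For example, if the replacement device only has CPUs 0 to 13, passing range(14) as the second argument of stale_cpu_ids reports every configured ID from 14 upward.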

Resolution

The recommended approach is to delete the profile and recreate it, because the additional CPUs or file systems are probably not the only difference.
Manually editing rsp.cfg is possible, but it must be done carefully to avoid corrupting the file and causing a more serious problem.
Make sure to back up the file before editing it.
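
One possible way to take that backup is a timestamped copy next to the original file. This is a minimal Python sketch using only the standard library. The function name backup_config is hypothetical, and the path to rsp.cfg must be supplied by the caller because it depends on the installation.

```python
import shutil
import time
from pathlib import Path


def backup_config(cfg_path):
    """Copy cfg_path to a timestamped sibling file and return the backup path."""
    src = Path(cfg_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.{stamp}.bak")
    # copy2 copies the content and also tries to preserve file metadata
    # such as modification time and permission bits.
    shutil.copy2(src, dst)
    return dst
```

Keeping the timestamp in the file name means repeated edits do not overwrite an earlier backup.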