Upgrade to rsp probe 5.52 from 5.51 causes all profiles to alarm

Article ID: 259022

Updated On:

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)

  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)

  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

After upgrading to UIM 20.4 CU, the rsp probe upgrade (5.51 to 5.52) triggers an alarm storm of the following error message for every profile:

  Database integrity problem with <xxx.example.com>, you should re-run discovery on the system

An alarm is then raised for every disk, CPU, etc. on every profile, for a total of around 1300 alarms.

Environment

  • Release: 20.4 CU5

Cause

  • Potential rsp database.db corruption, or loss of access/connectivity to host(s)
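Because the rsp database is a SQLite file (see Additional Information), file-level corruption can be checked with SQLite's own `PRAGMA integrity_check`. The following is a minimal sketch, not part of the product: the function name `sqlite_integrity_ok` is hypothetical, and it is assumed that the check is run against a copy of database.db, or while the probe is deactivated. A passing result only rules out corruption of the file itself; it does not show that the discovery tables contain every host.

```python
import sqlite3
from pathlib import Path


def sqlite_integrity_ok(db_path):
    """Return True if SQLite's built-in integrity check reports 'ok'.

    The file is opened read-only, so a missing file is not created
    and an existing file is not modified.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = None
    try:
        conn = sqlite3.connect(uri, uri=True)
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Raised for unreadable, missing, or non-SQLite files.
        return False
    finally:
        if conn is not None:
            conn.close()
    return rows == [("ok",)]
```

For example, `sqlite_integrity_ok("database_copy.db")` (a hypothetical file name) returns `False` when the copy cannot be read as a SQLite database.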

Resolution

Please follow the steps below:

  1. Deactivate the current rsp probe

  2. Rename the database.db file in the rsp probe folder, e.g., to database.bkp

  3. Activate the rsp probe

  4. Rediscover all monitoring targets via right-click -> Rediscover in the rsp GUI in Infrastructure Manager (IM)

  5. Acknowledge all rsp alarms

  6. Restart the rsp probe
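Step 2 can be scripted when the rename has to be repeated on several robots. The sketch below is illustrative only: the function name `set_aside_database` is hypothetical, the rsp probe folder must be supplied by the caller, and it is assumed that the probe has already been deactivated (step 1). It refuses to overwrite an existing backup.

```python
from pathlib import Path


def set_aside_database(rsp_dir, backup_name="database.bkp"):
    """Rename database.db inside rsp_dir to backup_name and return the new path.

    Assumes the rsp probe is deactivated so the file is not in use.
    """
    rsp_dir = Path(rsp_dir)
    source = rsp_dir / "database.db"
    target = rsp_dir / backup_name

    if not source.is_file():
        raise FileNotFoundError(f"No database.db found in {rsp_dir}")
    if target.exists():
        # Keep any earlier backup; let the operator decide what to do with it.
        raise FileExistsError(f"Backup already exists: {target}")

    source.rename(target)
    return target
```

After the rename, activating the probe (step 3) proceeds without the old database.db in place, and the targets are then rediscovered in step 4.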

If alarms persist after these steps, send the results and a screenshot of the alarms to Support. Note the following potential causes of rsp alarms:

  • connection error - occurs when the connection to the host fails or the login is refused. Based on the log output for any of these alarm situations, you can check connectivity via ping and verify the configured authentication type and credentials

  • connection timeout - occurs when the connection to the host times out

  • data collection failure - occurs when data collection fails for a checkpoint on the host

  • database integrity - occurs when a database integrity problem is detected for the host

  • duplicate data - occurs when data in a QoS series in the database is possibly duplicated because the cdm probe is also running on the same host

  • duplicate series - occurs when a data series in the QoS database is possibly duplicated because the cdm probe is also running on the same host
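For the connection error and connection timeout cases, a quick reachability test from the robot running rsp can complement a ping. The sketch below is a generic TCP check, not an rsp feature: the function name `can_connect` is hypothetical, and the port to test is an assumption that depends on the authentication type configured for the host (for example, 22 if the host is monitored over SSH). It does not validate credentials.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A call such as `can_connect("xxx.example.com", 22)` that returns `False` points to a network or service problem rather than a problem in the rsp database.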

Additional Information

  • These rsp alarms/errors indicate that information on the system is missing from the discovery tables internally in the rsp database (database.db)

    This is normally caused by the information becoming corrupted or by the host being renamed. It causes problems because the probe relies on finding host data in those tables so that it does not have to look it up every time.

  • The rediscover alarms mean that the server doesn't exist in the SQLite database for rsp

  • In the rsp GUI, rediscover the machines (right-click -> Rediscover, or Action -> Rediscover); this should stop the rediscover alarms.

  • Optionally, you can use the probe utility and specify each individual host.

    The discover_host probe utility callback gathers all the information about a system and stores it in database.db.
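When rediscovering hosts individually, the list of affected hosts can be derived from the alarm text quoted in the Issue section. The sketch below assumes the alarm messages are available as plain strings (for example, exported from the alarm console); the function name `hosts_needing_rediscovery` and the input format are hypothetical. It accepts host names with or without the surrounding angle brackets.

```python
import re

# Matches "Database integrity problem with <host>, ..." and the same text without brackets.
INTEGRITY_ALARM = re.compile(r"Database integrity problem with\s+<?([^\s<>,]+)>?\s*,")


def hosts_needing_rediscovery(messages):
    """Return the unique host names found in database integrity alarms, in first-seen order."""
    hosts = []
    seen = set()
    for message in messages:
        match = INTEGRITY_ALARM.search(message)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            hosts.append(match.group(1))
    return hosts
```

Each host in the returned list can then be rediscovered in the GUI or passed, one at a time, to the discover_host callback.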

Attachments

rsp-5.52-T1_1678889774953.zip