Alarms are getting triggered when the servers are in maintenance mode

Article ID: 189694

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

Alarm suppression does not take effect during the maintenance window: alarms are still triggered for servers that are in maintenance mode.

Environment

Release : 8.51

Component : UIM - ALARM POLICY

Resolution

First, please upgrade your nas and maintenance_mode probes to:

maintenance_mode-8.53-HF3

and

nas 8.56HF5

Both can be downloaded from the UIM hotfix depot:
https://techdocs.broadcom.com/us/product-content/recommended-reading/technical-document-index/ca-unified-infrastructure-management-hotfix-index.html?r=2

The maintenance_mode probe may periodically take more than 20 seconds to build the list of devices in maintenance mode. By default, the GA release of the nas probe waits only 20 seconds for a response from the maintenance_mode probe. If no response is received within 20 seconds, the nas probe closes the connection, which takes all devices out of maintenance mode, so alarms for those devices are no longer suppressed.

To extend the timeout, add the following key to the nas probe configuration:

   maint_max_resp_time

This gives the nas probe more time to complete the re-registration request with the maintenance_mode probe, so that the list of dev_ids is not lost when a re-registration would otherwise time out.
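The effect of the timeout can be illustrated with a small toy model. This is only a sketch of the behavior described above, not the nas probe's actual code; the function name, the device IDs, and the 35-second response time are hypothetical and chosen purely for illustration.

```python
# Toy model only: this is NOT the nas probe's implementation.
# All names and sample values below are hypothetical.

DEFAULT_MAINT_MAX_RESP_TIME = 20  # seconds; the default wait described in this article


def devices_kept_in_maintenance(dev_ids, response_seconds,
                                maint_max_resp_time=DEFAULT_MAINT_MAX_RESP_TIME):
    """Return the dev_ids still treated as in maintenance after one re-registration.

    If the maintenance_mode probe answers within the timeout, the list is kept.
    Otherwise the connection is closed and the list is lost.
    """
    if response_seconds <= maint_max_resp_time:
        return set(dev_ids)
    return set()


dev_ids = {"D1", "D2", "D3"}   # hypothetical device IDs
slow_response = 35             # hypothetical: the probe needs 35 s to build its list

# With the default 20 s timeout the list is lost, so alarms are raised again.
print(len(devices_kept_in_maintenance(dev_ids, slow_response)))  # 0

# With maint_max_resp_time raised to 50 the same response arrives in time.
print(len(devices_kept_in_maintenance(dev_ids, slow_response, maint_max_resp_time=50)))  # 3
```

In this model, any response time between the default and the extended timeout is exactly the case that the maint_max_resp_time key is meant to cover.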

Example:
From the nas probe's Raw Configure GUI, select the setup folder from the left-hand pane then Add/Modify the following key value:

   maint_max_resp_time = 50

In the same setup folder of the nas probe's Raw Configure GUI, also add or modify the following key value:

   registrationIntervalLookAheadMinutes = 60

This key determines how often the nas will attempt to re-register with the maintenance_mode probe.

The default is 30 minutes. If you have a large number of active maintenance schedules that do not change frequently, you can increase the re-registration interval to 60 or even 90 minutes.

maintenance_mode probe 8.53 HF3 introduces a new feature: a task that deletes expired maintenance windows, which improves UMP performance. The task is disabled by default.

To enable the task, set the new configuration key purge_maintenance_window_interval to a whole number in the setup section of the maintenance_mode probe configuration. The value is the interval in hours; a value of 1, for instance, means the task runs every hour.

For example, under the setup section:

   purge_maintenance_window_interval = 1

When the task runs, all expired maintenance window entries are deleted.

In addition to this task, whenever a schedule is deleted, its corresponding maintenance windows are deleted as well. This improves maintenance_mode performance.

To clean up maintenance schedules manually, refer to the following KB article:

"Cleaning up maintenance schedules"
https://knowledge.broadcom.com/external/article?articleId=106216