Alert sent for device under maintenance when UIM server was rebooted

Article ID: 200351

Products

DX Unified Infrastructure Management (Nimsoft / UIM)
CA Unified Infrastructure Management SaaS (Nimsoft / UIM)
Unified Infrastructure Management for Mainframe

Issue/Introduction

When maintenance is performed on the UIM servers, alarms for other servers that are under maintenance are sent out once the primary hub comes back up. It appears that the alarms arrive while the maintenance_mode probe is still querying the database for the maintenance windows. Is there any way to prevent or resolve this issue?

Environment

Release : 20.1

Component : UIM MAINTENANCE MODE

Resolution

Please review the KB Article:

When the UIM service in the Primary HUB Robot is restarted, we saw alarms from maintenance mode device.
https://knowledge.broadcom.com/external/article?articleId=6989

You can also try increasing the timeouts in the ems and nas probes as described below:

1) ems probe

In the <setup> section, add the key

maintenance_mode_cmd_timeout = 300000

if it does not exist. The value is in milliseconds.

2) nas probe

In Raw Configure or in the cfg file, in the <setup> section, add the key

maint_max_resp_time = 300

if it does not exist. The value is in seconds.

3) Also in the nas <setup> section, add the key

registrationIntervalLookAheadMinutes = 60

The value is in minutes.
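
The three steps above share the same rule: add a key to the <setup> section only if it is not already there. A minimal Python sketch of that rule follows. It assumes the probe cfg file uses <setup> ... </setup> section markers with "key = value" lines; the helper name add_missing_setup_keys and the sample text are hypothetical and are not part of UIM. Editing through Raw Configure remains the documented route, and the probe typically needs a restart for cfg changes to take effect.

```python
# Keys and values are taken from the steps above; note the differing units.
EMS_KEYS = {"maintenance_mode_cmd_timeout": "300000"}  # milliseconds
NAS_KEYS = {
    "maint_max_resp_time": "300",                      # seconds
    "registrationIntervalLookAheadMinutes": "60",      # minutes
}


def add_missing_setup_keys(cfg_text, keys):
    """Return cfg_text with each key appended to <setup> unless already set.

    Hypothetical helper: assumes '<setup>' and '</setup>' each sit on their
    own line and that keys are written as 'name = value'.
    """
    lines = cfg_text.splitlines()
    start = end = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped == "<setup>" and start is None:
            start = i
        elif stripped == "</setup>" and start is not None:
            end = i
            break
    if start is None or end is None:
        raise ValueError("no complete <setup> section found")

    # Collect keys defined directly in <setup>, skipping nested subsections.
    existing = set()
    depth = 0
    for line in lines[start + 1:end]:
        stripped = line.strip()
        if stripped.startswith("</"):
            depth -= 1
        elif stripped.startswith("<"):
            depth += 1
        elif depth == 0 and "=" in stripped:
            existing.add(stripped.split("=", 1)[0].strip())

    # Existing values are left untouched; only absent keys are appended.
    new_lines = [f"   {k} = {v}" for k, v in keys.items() if k not in existing]
    return "\n".join(lines[:end] + new_lines + lines[end:]) + "\n"


if __name__ == "__main__":
    sample = "<setup>\n   loglevel = 1\n</setup>\n"  # made-up cfg content
    print(add_missing_setup_keys(sample, NAS_KEYS))
```

The helper never overwrites a value that is already present, so a site that has tuned maint_max_resp_time to something other than 300 keeps its setting. Back up the cfg file before applying any change of this kind.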

Additional Information

registrationIntervalLookAheadMinutes
This key determines how often the nas attempts to re-register with the maintenance_mode probe.
The default is 30 minutes. If you have a large number of active schedules that do not change frequently, you can increase the re-registration interval to 60 or even 90 minutes.
 
maint_max_resp_time
This setting can be added to override the default timeout for re-registration with the maintenance_mode probe, which is 20 seconds. It addresses an issue where the nas intermittently lets alarms through for devices that are in maintenance because the maintenance_mode probe fails to respond to the re-registration request from the nas within the default timeout period.