A possible cause of this issue is that the maintenance_mode probe was not reachable when the issue occurred, so the NAS could not register with maintenance_mode.
As a result, the NAS discarded the maintenance schedule.
The NAS collects maintenance schedules from the maintenance_mode probe at run time, so alarms 'leaked' through for the period in which the probe could not be reached.
The maintenance_mode probe may have been unreachable because of network or database (DB) issues.
A DB issue may show up in the maintenance_mode log as a registration failure.
Example:
The log shows a WARN entry with a SQLServerException at Apr 12 00:30:16:588.
The exception started at:
Apr 12 00:30:16:588 WARN [attach_socket, com.nimsoft.monitor.probe.MaintenanceModeProbe] Failure registering to maintenance_mode. org.springframework.dao.DataAccessResourceFailureException: StatementCallback
The exception continued until:
Apr 12 00:52:03:871 WARN [attach_socket, com.nimsoft.monitor.probe.MaintenanceModeProbe] Failure registering to maintenance_mode. org.springframework.dao.DataAccessResourceFailureException
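To see how long registration was failing, the log can be scanned for the first and last failure entries. The following is a minimal Python sketch, assuming the log lines follow the layout of the excerpt above (month, day, time with milliseconds, level, then the message). The function name registration_failure_window and the year parameter are hypothetical; the log lines carry no year, so one has to be supplied.

```python
import re
from datetime import datetime

# Assumed line layout, based only on the excerpt above:
# "Mon DD HH:MM:SS:mmm LEVEL [thread, class] Failure registering to maintenance_mode. ..."
FAILURE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}:\d{3}) +[A-Z]+ .*"
    r"Failure registering to maintenance_mode"
)


def registration_failure_window(lines, year=2000):
    """Return (first, last, count) for registration-failure lines, or None.

    The year argument is a placeholder because the log timestamps omit it.
    """
    stamps = []
    for line in lines:
        match = FAILURE_RE.match(line)
        if match:
            stamps.append(
                datetime.strptime(f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S:%f")
            )
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)
```

Passing the lines of the maintenance_mode log to this function gives the start and end of the failure window, which can then be compared with the period in which alarms leaked through.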
A new key was introduced that can help work around this issue in a similar scenario.
The new parameter maint_sched_discard lets you decide whether the maintenance schedule is discarded.
You can set the value to yes or no. A value of no means that the maintenance schedule is retained.
The key is found in the nas probe's setup section via Raw Configure.
The default is:
maint_sched_discard = yes
Reference: nas (Alarm Server) Release Notes
To retain the schedule, set it to no:
maint_sched_discard = no
With this setting, the maintenance schedules are not discarded if maintenance_mode is not reachable in a similar scenario.
Make the following adjustments to the nas and ems probes:
Open the nas probe in Raw Configure mode and set the following parameters under the <setup> section:
maint_max_resp_time = 50
registrationIntervalLookAheadMinutes = 60
Open the ems probe in Raw Configure mode and set the following parameter under the <setup> section:
maintenance_mode_cmd_timeout = 300000
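After making the changes, it can be useful to confirm that the values were actually saved. The following is a minimal Python sketch that compares the text of a probe configuration with the values given in this article. It assumes the configuration stores key = value pairs between <setup> and </setup> lines; the helper names read_section and missing_or_different are hypothetical, and how the configuration text is obtained is left to the reader.

```python
import re

# Values taken from this article, keyed by probe name.
RECOMMENDED = {
    "nas": {
        "maint_sched_discard": "no",
        "maint_max_resp_time": "50",
        "registrationIntervalLookAheadMinutes": "60",
    },
    "ems": {"maintenance_mode_cmd_timeout": "300000"},
}


def read_section(cfg_text, section="setup"):
    """Collect key = value pairs found directly inside <section> ... </section>.

    Assumption: sections open with <name> and close with </name> on their own lines.
    """
    values, path = {}, []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if re.fullmatch(r"</(\S+)>", line):
            if path:
                path.pop()  # leave the current section
        elif (opened := re.fullmatch(r"<([^/>\s][^>]*)>", line)):
            path.append(opened.group(1))  # enter a (possibly nested) section
        elif path == [section] and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def missing_or_different(cfg_text, probe):
    """Map each key that deviates to an (actual, recommended) pair."""
    actual = read_section(cfg_text)
    return {
        key: (actual.get(key), wanted)
        for key, wanted in RECOMMENDED[probe].items()
        if actual.get(key) != wanted
    }
```

An empty result means every value listed in this article is present in the <setup> section; any other result lists the keys that are missing or still set to a different value.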
The new parameter requires nas version 9.32 or later; nas 9.32HF1 is recommended.
This may also help resolve false robot_inactive alarms that are generated during maintenance schedules.
Note that informational and clear alarm messages are still displayed during a maintenance window.