Generate processes down alarm only after two consecutive checks

Article ID: 95863

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

How can an alarm be sent only after a given process has been checked and found to be down twice in a row, rather than on the first check that finds the process in the 'down' state?

Environment

  • Any supported UIM version
  • processes (any version)
  • nas (any version)

Resolution

Use two nas Auto-Operator (AO) profiles. The first profile sets every matching incoming alarm to invisible when it arrives. The second profile sets matching alarms to visible once the alarm count is greater than or equal to (gte) 2, and it runs on an interval so that those alarms remain visible.

Tips: Try the processes probe default check interval of 60 seconds together with an "On every interval" value of 2m for the second nas rule, or adjust both settings to get the desired timing.

Note:

Stop the process to generate the alarm.

Start the process to clear the alarm.
 
Nas AO profile 1:
 
Action type: set_visibility
Make event invisible
Action mode: On message arrival
Message string: /.*<process_name>.*/
Message Counter: Equals to 1
 
NOTE: You may have to adjust the processes check interval and the nas rule 'On every interval' setting to make this practical. For instance, set 'On every interval' to slightly longer than the processes check interval.



Nas AO profile 2:
 
Action type: set_visibility
Make event visible
Action mode: On every interval
Message string: /.*<process_name>.*/
Message Counter: Greater than or equal 2

 
Note that the Count continues to rise and the alarm remains until it is cleared. You can test it by starting the given process again.
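The combined effect of the two profiles can be illustrated with a small simulation. This is only a sketch of the logic described above, not nas code: the `Alarm` class, the function names, and the process name `myproc` are hypothetical, and it assumes that a repeated alarm for the same process increments the count of the existing alarm.

```python
import re
from dataclasses import dataclass


@dataclass
class Alarm:
    message: str
    count: int = 1       # number of times this alarm has been received
    visible: bool = True


# Hypothetical stand-in for the message string /.*<process_name>.*/
PATTERN = re.compile(r".*myproc.*")


def profile_1_on_arrival(alarm: Alarm) -> None:
    """Profile 1: on message arrival, hide matching alarms whose count equals 1."""
    if PATTERN.match(alarm.message) and alarm.count == 1:
        alarm.visible = False


def profile_2_on_interval(alarms: list) -> None:
    """Profile 2: on every interval, show matching alarms whose count is >= 2."""
    for alarm in alarms:
        if PATTERN.match(alarm.message) and alarm.count >= 2:
            alarm.visible = True


# First failed check: the alarm arrives with count 1 and is hidden.
alarm = Alarm("Process myproc is not running")
profile_1_on_arrival(alarm)
profile_2_on_interval([alarm])
first_check_visible = alarm.visible

# Second failed check: the count rises to 2, profile 1 no longer matches,
# and the next interval run of profile 2 makes the alarm visible.
alarm.count += 1
profile_1_on_arrival(alarm)
profile_2_on_interval([alarm])
second_check_visible = alarm.visible

print(first_check_visible, second_check_visible)
```

In this sketch the alarm stays hidden after the first failed check and only becomes visible once the second failed check has raised the count to 2 and the interval rule has run.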

Additional Information

Tips:

  • Use the nas GUI Status Tab/window to watch the alarms come in (invisible), then watch the alarms change to visible (when the count reaches 2).
  • If you set the processes probe monitoring interval to 300 seconds, set the 'On every interval' value in the 2nd rule a bit higher, such as 5 minutes (5m). 'On every interval' is simply the interval you specify for how often the rule should run.
  • If the probe check interval is set too high, the status of a monitored process may change between checks.
    • For example, if the interval is set to 10 minutes (10m), a process may be down and trigger an alarm at the first check, come up, and then go back down before the second check. This would trigger the alarm again, even though the process was not down for the whole period. This may or may not be what you want, depending on your use case, so take it into consideration.
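To get a rough feel for how the two intervals interact, the delay between a process going down and its alarm becoming visible can be estimated as follows. This is a simplified sketch under stated assumptions: the function `seconds_until_visible` is hypothetical, both the probe checks and the 'On every interval' rule are assumed to fire exactly at multiples of their intervals starting from the same moment, and the process is assumed to stay down. Actual scheduling in the processes probe and the nas may differ.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def seconds_until_visible(check_interval: int, ao_interval: int, down_at: int) -> int:
    """Estimate the seconds between a process going down and its alarm becoming visible.

    check_interval: processes probe check interval in seconds
    ao_interval:    'On every interval' value of the 2nd rule in seconds
    down_at:        second at which the process goes down
    """
    # First check that finds the process down (alarm count becomes 1, alarm hidden).
    first_fail = ceil_div(down_at, check_interval) * check_interval
    # Second consecutive check that finds it down (alarm count becomes 2).
    second_fail = first_fail + check_interval
    # First run of the interval rule at or after the count reaches 2.
    visible_at = ceil_div(second_fail, ao_interval) * ao_interval
    return visible_at - down_at


# 60 second checks with a 2m interval rule, process stops 1 second after a check.
print(seconds_until_visible(60, 120, down_at=1))
# 300 second checks with a 5m interval rule.
print(seconds_until_visible(300, 300, down_at=1))
```

Under these assumptions, longer check intervals directly lengthen the time before the alarm is shown, which is one reason to keep the probe interval modest when using this two-profile approach.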