How does NAS storm protection work?
How can I prevent alarm storms from affecting my DX UIM NAS?
After a message flood affected the alarms in UIM, how can I prevent probes from sending a massive number of alarms to the NAS?
DX UIM 20.4.* / 23.4.*
Guidance
How does NAS Storm protection work:
If the number of alarms matching the same signature exceeds a threshold (storm_threshold) within a specified time window (storm_timewindow), succeeding alarms are quarantined by re-publishing the message to the configured subject (storm_subject). Two related keys are:
setup > storm_message
setup > storm_severity_level
storm_message supports variable expansion from the message header, e.g.
Placing alarm(s) from $domain:$origin:$robot:$prid:suppkey=$supp_key, total:%d
would be represented as:
storm_severity_level
storm_severity_level = 5
This sets the alarm severity to Critical.
The storm_protection value determines which elements make up the key “signature”:
0. disabled
1. source, domain, robot, probe-id and supp_key
2. source, domain, robot, probe-id
3. source, domain, robot
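The mapping above can be sketched in a few lines of Python. This is only an illustration of how a signature could be derived from the listed elements, not the NAS implementation; the AlarmHeader type, the field names and the storm_signature function are hypothetical names chosen for the example.

```python
from typing import NamedTuple, Optional, Tuple


class AlarmHeader(NamedTuple):
    """Hypothetical subset of an alarm message header."""
    source: str
    domain: str
    robot: str
    prid: str       # probe-id
    supp_key: str


# Signature elements per storm_protection value, as listed above.
SIGNATURE_FIELDS = {
    1: ("source", "domain", "robot", "prid", "supp_key"),
    2: ("source", "domain", "robot", "prid"),
    3: ("source", "domain", "robot"),
}


def storm_signature(alarm: AlarmHeader, storm_protection: int) -> Optional[Tuple[str, ...]]:
    """Return the signature tuple, or None when storm protection is disabled (0)."""
    if storm_protection == 0:
        return None
    return tuple(getattr(alarm, name) for name in SIGNATURE_FIELDS[storm_protection])


# Two alarms from the same robot but different probes (made-up values).
a = AlarmHeader("10.0.0.5", "DomA", "robot1", "cdm", "disk/C")
b = AlarmHeader("10.0.0.5", "DomA", "robot1", "ntevl", "evt/42")

same_at_level_3 = storm_signature(a, 3) == storm_signature(b, 3)  # counted together
same_at_level_2 = storm_signature(a, 2) == storm_signature(b, 2)  # counted separately
```

In this sketch, a higher storm_protection value gives a coarser signature, so more alarms are counted toward the same threshold.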
How to enable NAS Storm Protection:
Note:
The storm capacity determines how many messages are retained in the transaction log and how many are discarded.
The NAS decides that the storm has died down using the same logic, i.e. 3000 messages per 5 minutes: when that condition is no longer true, it returns to a normal state. Keep in mind that these times are asymmetric. If 2990 alarms arrive in the first 10 seconds and 10 more alarms arrive at 4 minutes 50 seconds, the storm will be over 10 seconds after it started, because the arrival times of the first batch were heavily biased toward the start of the time window.
That is the duration for quarantined messages to be published back to the Nimsoft message bus (NimBUS). It is like the samples value in the cdm probe, applied when the storm dies down. It is a sliding window.
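The asymmetric timing in the note can be reproduced with a small sliding-window counter. The following is a minimal sketch, not the NAS code: the StormWindow class is a hypothetical name, the window is assumed to be expressed in seconds, and the storm is assumed to start when the count in the window reaches the threshold (which is what the 2990 + 10 example implies).

```python
from collections import deque


class StormWindow:
    """Hypothetical sliding-window counter for one alarm signature."""

    def __init__(self, threshold: int = 3000, timewindow: float = 300.0):
        self.threshold = threshold      # e.g. 3000 messages
        self.timewindow = timewindow    # e.g. 5 minutes, assumed to be in seconds
        self.arrivals = deque()         # arrival times still inside the window

    def _expire(self, now: float) -> None:
        # Drop arrivals that have aged out of the window.
        while self.arrivals and now - self.arrivals[0] >= self.timewindow:
            self.arrivals.popleft()

    def in_storm(self, now: float) -> bool:
        self._expire(now)
        return len(self.arrivals) >= self.threshold

    def record(self, now: float) -> bool:
        """Register one alarm at time `now`; return True if it falls in a storm."""
        self._expire(now)
        self.arrivals.append(now)
        return self.in_storm(now)


window = StormWindow()

# 2990 alarms spread over the first 10 seconds: below the threshold.
for i in range(2990):
    window.record(i * 10 / 2990)

# 10 more alarms at 4 min 50 s (t = 290): the last one reaches 3000.
states = [window.record(290.0) for _ in range(10)]

storm_at_299 = window.in_storm(299.0)   # True: all 3000 still in the window
storm_at_300 = window.in_storm(300.0)   # False: the first batch starts to expire
```

The storm begins at t = 290 and ends at t = 300, ten seconds later, because the early batch leaves the window almost as soon as the threshold is reached.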