The device generated multiple alarms despite being continuously down for more than 2 hours.
We noted that these alarms cleared even though the established threshold was still being violated.
The best approach here is to extend the ToT and the sliding window. Metrics such as packet loss, latency, and jitter are polled at a high frequency and fluctuate by nature, so a short evaluation period reacts to every brief change. If the issue is intermittent (it flip-flops, starts and stops, comes and goes), the result is misleading: each alarm is auto-cleared when the metric dips and a new alarm is generated when it rises again.
As currently configured, the MCS alarm policy for packet loss alarms uses a ToT of 3 minutes out of a 5-minute sliding window. That is almost equivalent to alarming immediately, and with auto-clear enabled it is, in many cases, regenerating the alarms with new alarm IDs. The purpose of ToT is to reduce unwanted alarms and noise and to generate an alarm only when an actual problem persists.
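To illustrate why a short ToT regenerates alarms, here is a minimal Python sketch. It is not the product's actual evaluation logic: the function name count_alarms, the one-sample-per-minute polling, the raise/clear rule, and the sample data (9 minutes of loss above threshold followed by 3 minutes below, repeated for 2 hours) are all assumptions made for illustration.

```python
from collections import deque


def count_alarms(samples, tot_minutes, window_minutes):
    """Count how many separate alarms a ToT policy would raise.

    samples: one boolean per minute, True when packet loss is above
    the threshold (hypothetical data, one poll per minute assumed).
    An alarm is raised when at least tot_minutes of the last
    window_minutes were above threshold, and auto-cleared as soon as
    that is no longer true. Each raise counts as a new alarm ID.
    """
    window = deque(maxlen=window_minutes)  # sliding window of recent polls
    active = False
    raised = 0
    for over_threshold in samples:
        window.append(over_threshold)
        violating = sum(window)  # minutes above threshold in the window
        if not active and violating >= tot_minutes:
            active = True
            raised += 1
        elif active and violating < tot_minutes:
            active = False  # auto-clear
    return raised


# Hypothetical intermittent problem: 9 bad minutes, 3 good minutes, for 2 hours.
samples = ([True] * 9 + [False] * 3) * 10

print("3 of 5 min  :", count_alarms(samples, 3, 5), "alarms")
print("20 of 30 min:", count_alarms(samples, 20, 30), "alarms")
```

On this made-up pattern the 3-of-5 setting raises 10 separate alarms in 2 hours, while the 20-of-30 setting raises a single alarm that stays open. Real data will behave differently, so treat this only as a way to reason about the settings.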
ToT is the total amount of time that packet loss must be above the threshold within the sliding window to trigger an alert (e.g., 20 minutes out of a 30-minute sliding window). The auto-clear option should remain enabled. We suggest gathering some data to make an informed decision, by checking the latency behaviour over time using tracert/traceroute or a ping script (ping -a <ip>).
For the time frame, try 14 out of 15 minutes, 19 out of 20 minutes, or 20 out of 30 minutes. At least 3 monitoring intervals should be used, because the environment is fluctuating or there are some underlying network issues. The final settings must be determined by the network admin/end user through data collection, assessment, and a decision on the monitoring requirements.
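As a rough check of the "at least 3 monitoring intervals" guideline, the sketch below converts each candidate setting into a number of polls. The 5-minute polling interval and the helper name samples_in_window are assumptions, not values taken from the MCS configuration; substitute the real interval for the device.

```python
import math


def samples_in_window(tot_minutes, window_minutes, poll_interval_minutes):
    """Return (polls that must violate, polls in the window)."""
    window_samples = window_minutes // poll_interval_minutes
    tot_samples = math.ceil(tot_minutes / poll_interval_minutes)
    return tot_samples, window_samples


POLL_INTERVAL = 5  # minutes; assumed, check the actual monitoring interval

# Current setting first, then the suggested candidates (ToT, window) in minutes.
candidates = [(3, 5), (14, 15), (19, 20), (20, 30)]

for tot, window in candidates:
    tot_s, win_s = samples_in_window(tot, window, POLL_INTERVAL)
    verdict = "ok" if win_s >= 3 else "fewer than 3 intervals"
    print(f"{tot} of {window} min -> {tot_s} of {win_s} polls ({verdict})")
```

Under the assumed 5-minute interval, the current 3-of-5 setting covers a single poll, which is why it behaves almost like an immediate alarm, while each suggested candidate spans 3 or more polls.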