We often see the Threshold Monitoring XML setting change to false, after which alarms stop being generated.
Release : 3.7
Component : IM Reporting / Admin / Configuration
Take Action If Threshold Evaluations Are Suspended
Try to correlate the change in performance to configuration changes in the system.
Reduce the overall number of active event rules. Turn off event rules one at a time.
Check the performance after you turn off each rule before turning off another rule.
Reduce the overall number of active event rules that have windows greater than 300 seconds.
Reduce the number of Violation event conditions within event rules.
Reduce the number of event rules that use a condition type of Standard Deviation.
Verify that only required collections are applied to the monitoring profiles or threshold profiles that contain event rules.
Verify that only required devices are contained within collections that are associated with these monitoring profiles or threshold profiles.
Based on the RED line, we checked your threshold profiles and found profiles with a Window value above 300 seconds, so we adjusted the Threshold Limiter recovery interval from 15 minutes to 30 minutes:
curl -v -H "Content-Type:application/xml" -X PUT http://18.104.22.168:8581/rest/thresholdmonitoring/config/3609 --data "<ThresholdMonitoringConfiguration version='1.0.0'><RecoveryIntervalInMinutes>30</RecoveryIntervalInMinutes></ThresholdMonitoringConfiguration>"
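If the interval needs to be changed again, the same request can be wrapped in a small shell helper. This is only a sketch: the function and variable names (build_payload, set_recovery_interval, DA_HOST, DA_PORT, CONFIG_ID) are hypothetical, and the host, port, and configuration ID are copied from the command above and are assumed to differ in other environments.

```shell
# Values copied from the command above; assumed to be environment-specific.
DA_HOST="18.104.22.168"
DA_PORT="8581"
CONFIG_ID="3609"

# Build the XML body for a given recovery interval in minutes.
build_payload() {
  printf "<ThresholdMonitoringConfiguration version='1.0.0'><RecoveryIntervalInMinutes>%s</RecoveryIntervalInMinutes></ThresholdMonitoringConfiguration>" "$1"
}

# Send the PUT request. Rejects anything that is not a whole number
# before any network call is made.
set_recovery_interval() {
  case "$1" in
    ''|*[!0-9]*)
      echo "interval must be a whole number of minutes" >&2
      return 1
      ;;
  esac
  curl -v -H "Content-Type:application/xml" -X PUT \
    "http://${DA_HOST}:${DA_PORT}/rest/thresholdmonitoring/config/${CONFIG_ID}" \
    --data "$(build_payload "$1")"
}
```

For example, set_recovery_interval 30 sends the same request as the curl command above.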