After deploying the vmware 7.11HF1 probe, the tot_rule_config queue starts to back up at the third monitoring interval, and the alarm_enrichment probe (8.56HF3) never catches up. This issue occurs with vmware probes configured to monitor relatively large vCenters where Distributed Resource Scheduler (DRS) is configured to automatically move VMs to other ESX hosts when additional resources are needed.
UIM/UMP: 8.51; vmware: 7.11, 7.11HF1, and 7.11HF5; nas: 8.56HF3
The vmware 7.11 and 7.11HF1 releases publish every tot_rule_config message twice in each monitoring interval in which the topology changes (addition or deletion of VMs, vMotion of existing VMs from one ESX host to another, etc.). One set of tot_rule_config messages is published to address deleted devices, and the second set to address devices that were added. In very busy vCenters, topology changes can occur every monitoring interval, and with a large number of monitored devices this results in a large number of tot_rule_config messages being published every monitoring interval. When the alarm_enrichment probe is managing a large number of active alarms for ToT interval compliance, and those alarms are affected by any of the existing ToT rule configurations represented in the tot_rule_config messages, the probe must check all of these active alarms against the ToT rules to see whether the rules apply. It appears that this check becomes very inefficient when there is a large number of active alarms, causing the tot_rule_config queue to back up.
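The effect described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not the probe's actual logic. It assumes that each tot_rule_config message is published once in an interval without topology changes and twice in an interval with changes, and that processing one message costs one check per active alarm. The function names, the fixed per-interval check budget, and all numbers are hypothetical and chosen only for illustration.

```python
def published_messages(rule_configs, topology_changed):
    """tot_rule_config messages published in one monitoring interval.

    Assumption: one message per rule configuration normally, doubled
    (one set for deleted devices, one for added devices) on a topology change.
    """
    return rule_configs * (2 if topology_changed else 1)


def simulate_backlog(rule_configs, active_alarms, checks_per_interval, changes):
    """Return the queue backlog (in messages) after each monitoring interval.

    Assumption: each message requires one check per active alarm, and the
    consumer can perform a fixed number of checks per interval (hypothetical).
    """
    # Messages the consumer can drain per interval under the check budget.
    drained_per_interval = checks_per_interval // max(active_alarms, 1)
    backlog = 0
    history = []
    for topology_changed in changes:
        backlog += published_messages(rule_configs, topology_changed)
        backlog = max(0, backlog - drained_per_interval)
        history.append(backlog)
    return history


# Hypothetical numbers: 500 rule configurations, 1000 active alarms, and a
# budget of 600,000 checks per interval (600 messages drained per interval).
# Topology changes begin in the third interval.
print(simulate_backlog(500, 1000, 600_000, [False, False, True, True, True]))
# → [0, 0, 400, 800, 1200]
```

With these made-up values the consumer keeps up while messages are published once per interval, but the backlog grows by 400 messages in every interval with a topology change, so it never catches up while changes continue.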