I set up a threshold profile for CPU utilization and want to evaluate the average CPU utilization by device, so I used the "aggregate by device" option.
However, an Event was raised even though the average CPU utilization during the time frame did not go above 70%. We want to alert on average utilization by device. Is something set up incorrectly here?
At the device level, the data for the time frame in which the Event was raised shows no data points >70%, so the Event appears to have been raised when it should not have been.
The same data broken out by component item shows that the data for two items violates the threshold, but the three items together result in data <70%.
All supported DX NetOps Performance Management releases
Vendor Certification Priority (VCP) Grouping is in use and needs to be accounted for when using an Event Threshold Rule that is aggregating data at the device level.
In this scenario the CPU Metric Family (MF) was set up with VCP Grouping, and two different Vendor Certification (VC) entries were supported against the CPU MF. The same behavior can be seen when using VCP Grouping for the Memory MF.
In this scenario the Event Threshold evaluates each VC individually. The two items with data >70% are polled by one VC. The one item with data <70% is polled by the second VC.
An Event was correctly raised for the two items from the one VC whose data violates the threshold. If the other item had also been violating the threshold, it would also have raised an Event.
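The difference between a true device-level average and a per-VC evaluation can be illustrated with a minimal Python sketch. The item names, VC labels, and utilization values are hypothetical, chosen only to mirror this scenario, and the sketch assumes the aggregate is a simple average per VC group. It is not the product's internal implementation.

```python
from collections import defaultdict
from statistics import mean

THRESHOLD = 70.0  # percent, matching the 70% threshold in this scenario

# Hypothetical samples: (item name, vendor certification, average CPU %)
samples = [
    ("cpu-1", "VC-A", 85.0),
    ("cpu-2", "VC-A", 80.0),
    ("cpu-3", "VC-B", 15.0),
]

def device_average(samples):
    """Average across every CPU item on the device, ignoring the VC."""
    return mean(value for _, _, value in samples)

def per_vc_averages(samples):
    """Average the items separately for each VC group."""
    groups = defaultdict(list)
    for _, vc, value in samples:
        groups[vc].append(value)
    return {vc: mean(values) for vc, values in groups.items()}

def violating_vcs(samples, threshold=THRESHOLD):
    """Return the VC groups whose own average exceeds the threshold."""
    averages = per_vc_averages(samples)
    return sorted(vc for vc, avg in averages.items() if avg > threshold)

print(device_average(samples))   # 60.0, below the threshold
print(per_vc_averages(samples))  # {'VC-A': 82.5, 'VC-B': 15.0}
print(violating_vcs(samples))    # ['VC-A'], so an Event is raised
```

With these made-up numbers the device as a whole averages 60%, yet the VC-A group on its own averages 82.5% and violates the threshold, which matches the symptom described above.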
If the intent is to maintain the VCP Grouping configuration, the threshold evaluation needs to apply to all CPU items at once. This requires the Manage Aggregated Components feature, as detailed in our TechDocs:
TechDocs : DX NetOps 24.3 : Manage Aggregated Components
Create an aggregated component using a single device with VCP Grouping that shows the same symptoms. Adding the device to its own aggregation makes the threshold evaluation review the CPUs from the different VCs together as a whole for aggregated rules.
Note that when multiple devices are added to a single component aggregation configuration, the data from all the components on those devices is evaluated at once.
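The effect of placing several devices in one aggregation can be sketched the same way. The device names and values below are hypothetical, and the sketch assumes the aggregation is a simple average over the pooled components.

```python
from statistics import mean

THRESHOLD = 70.0  # percent

# Hypothetical per-device CPU item averages (percent)
device_cpu = {
    "router-a": [85.0, 80.0, 15.0],
    "router-b": [90.0, 95.0, 88.0],
}

def aggregation_average(device_names, data=device_cpu):
    """Pool every CPU item from the listed devices and average them together."""
    pooled = [value for name in device_names for value in data[name]]
    return mean(pooled)

single = aggregation_average(["router-a"])
combined = aggregation_average(["router-a", "router-b"])

print(single, single > THRESHOLD)      # 60.0 False
print(combined, combined > THRESHOLD)  # 75.5 True
```

In this illustration router-a alone stays under the threshold, but pooling it with router-b pushes the combined average over 70%. To keep the evaluation specific to one device, give that device its own aggregation.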
Alternatively, if VCP Grouping isn't needed, break the VCP Grouping by reversing the changes that were made using the steps outlined in our TechDocs:
TechDocs : DX NetOps 24.3 : Manage Vendor Certification Priorities