Users may report that alerts for OpenShift metrics (e.g., QOS_OPENSHIFT_NODE_CPU_USAGE_PERCENTAGE, QOS_OPENSHIFT_NODE_MEMORY_PRESSURE) are not triggering in DX Unified Infrastructure Management (UIM) even when manual CLI checks (like oc describe node) suggest a threshold breach.
Specifically, the following Quality of Service (QOS) metrics may appear inconsistent:
QOS_OPENSHIFT_NODE_SPEC_UNSCHEDULABLE
QOS_OPENSHIFT_NODE_NETWORK_UNAVAILABLE
QOS_OPENSHIFT_NODE_MEMORY_PRESSURE
QOS_OPENSHIFT_NODE_CPU_USAGE_PERCENTAGE
The discrepancy typically arises from how the openshift probe cluster-info pod calculates percentages compared to standard OpenShift CLI tools:
oc describe node <node_name>, they will match for any node.
containerinfo which is running in the openshift probe daemonset app-container-monitor
top command or standard OpenShift descriptions.
Confirm that the threshold is based on actual usage rather than Request/Limit ratios. If you require alerts based on Request/Limit ratios, ensure that the specific QOS for those attributes is being monitored.
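The difference between a usage-based percentage and a Request/Limit ratio can be illustrated with a short worked example. This is a minimal sketch: the node figures are hypothetical, and it assumes the usage percentage is consumed CPU divided by allocatable CPU, which may not be the probe's exact formula.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole, rounded to two decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 2)


# Hypothetical node figures in CPU millicores (illustrative only).
allocatable_mcpu = 4000  # allocatable CPU on the node
requested_mcpu = 3600    # sum of pod CPU requests, as listed by `oc describe node`
used_mcpu = 1000         # CPU actually consumed at sample time

# Ratio of requests to allocatable capacity (what the CLI summary emphasizes).
request_ratio = percent(requested_mcpu, allocatable_mcpu)
# Actual usage as a share of allocatable capacity (assumed basis of the usage QOS).
usage_pct = percent(used_mcpu, allocatable_mcpu)

threshold = 80.0  # hypothetical alarm threshold in percent
request_alert = request_ratio > threshold
usage_alert = usage_pct > threshold

print(f"requests: {request_ratio}% -> alert={request_alert}")
print(f"usage:    {usage_pct}% -> alert={usage_alert}")
```

With these numbers the request ratio exceeds the threshold while actual usage does not, so a threshold defined on usage would not raise an alarm even though the CLI output appears to show a breach.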
Ensure the thresholds are set correctly for the metric type:
0. Set threshold to > 0 to alert when the node becomes unschedulable.
0. Set threshold to 1 (or > 0) to alert on network issues.
1.
To verify what the probe is actually recording:
Raw.
If metrics are consistently reporting as 0.00 despite known activity, update the image version in your values.yaml to include the latest metric processing fixes.
Configure Openshift Monitoring Best Practices | Known issues | Requirements