Understanding new APMIA resource metrics

Article ID: 438621

Updated On:

Products

CA Application Performance Management (APM / Wily / Introscope)
DX Operational Observability

Issue/Introduction

APMIA, which replaced EPAgent, reports resource metrics differently. Is the behavior below a correct and accurate representation of our system health, based on the following points?
Averaging and Resolution: APMIA calculates CPU utilization as a weighted average over its sampling interval (regardless of whether the resolution is set to the 15s minimum or a higher value). This effectively filters out transient micro-spikes that do not impact the overall stability of the OS.

Normalized Capacity: It appears APMIA reflects the total capacity of the system (all cores combined). A 2% average suggests that the vast majority of the total processing power is idle, making momentary individual core spikes irrelevant to the overall system performance.

Performance Assessment: This suggests that CPU is not a bottleneck and that the "spikes" previously seen in EPAgent were likely sub-second bursts that do not represent real resource saturation. Therefore, the current 2% reading is the metric we should trust for capacity planning and alerting.

Could you please confirm that it is correct to consider the system "healthy and quiet" at 2% CPU, and that we should not be concerned about the absence of the momentary spikes previously reported by EPAgent?
 

APMIA Resource Metrics

 

Environment

DX O2 SaaS release

Resolution

Yes. It is correct to consider the system "healthy and quiet" at 2% average CPU usage, and you should not be concerned about the absence of the momentary spikes. This smoothing is in fact desirable because it avoids false-positive alerts, which lead to alert fatigue. If the system were facing a real CPU bottleneck, the 15-second average would rise and stay high, because the processor would be busy throughout the interval rather than only during short spikes.
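The effect of interval averaging across all cores can be illustrated with a minimal Python sketch. It is not APMIA's actual implementation: the core count, the sub-second sampling rate, the 1% idle level, and the use of a plain (unweighted) arithmetic mean are all assumptions chosen for illustration. Only the 15-second interval comes from the discussion above.

```python
from statistics import mean

INTERVAL_S = 15      # reporting interval discussed above (15-second minimum)
SAMPLES_PER_S = 10   # hypothetical sub-second sampling rate
CORES = 4            # hypothetical number of CPU cores
IDLE_PCT = 1.0       # hypothetical background utilization of an idle core


def build_interval(busy_cores, busy_samples):
    """Return per-core utilization samples (%) for one reporting interval.

    The first `busy_cores` cores run at 100% for `busy_samples` samples
    and sit at IDLE_PCT otherwise; the remaining cores stay idle.
    """
    total = INTERVAL_S * SAMPLES_PER_S
    busy = [100.0] * busy_samples + [IDLE_PCT] * (total - busy_samples)
    idle = [IDLE_PCT] * total
    return [busy] * busy_cores + [idle] * (CORES - busy_cores)


def interval_average(per_core_samples):
    """Average utilization (%) over all cores and all samples in the interval."""
    return mean(mean(core) for core in per_core_samples)


# One core saturated for half a second (5 samples) within the interval.
transient_spike = interval_average(build_interval(busy_cores=1, busy_samples=5))

# Three of four cores saturated for the whole interval.
sustained_load = interval_average(
    build_interval(busy_cores=3, busy_samples=INTERVAL_S * SAMPLES_PER_S)
)

print(f"transient spike: {transient_spike:.2f}%")
print(f"sustained load:  {sustained_load:.2f}%")
```

Under these assumptions, the half-second spike on a single core leaves the interval average below 2%, while the sustained load pushes it above 75%. A real bottleneck therefore still shows up clearly in the averaged metric.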

If application performance degrades while CPU usage still reports 2%, the bottleneck is not raw processing power. In that case, check the metrics for other resources that APMIA also monitors, such as disk I/O, memory, or network latency.