Understanding "Peak" or "Worst" Performance Metrics in VMware VCF / Aria Operations

Products

VCF Operations

Issue/Introduction

The vSphere VM Performance List view (or similar dashboards) shows seemingly inflated or unusually high values for metrics such as Worst CPU Queue, Worst Disk Queue, Worst Disk Latency, or Worst TX Packet Drop.
When checking the same Virtual Machines (VMs) directly in vCenter Performance Charts or via esxtop, performance appears completely normal (e.g., CPU Ready and Co-Stop are near 0%).
Users report no actual performance degradation or negative end-user experience on the affected VMs.

Environment

VCF Operations 9.x

Aria Operations 8.x

Cause

This behavior is by design and occurs due to differences in data granularity, collection mechanics, and retention policies between VMware vCenter and VCF Operations (formerly Aria Operations).

1. 20-Second Micro-Spikes vs. 5-Minute Averages

The vCenter adapter collects data from vCenter every 5 minutes (the standard collection cycle). In each cycle, the adapter retrieves the 15 raw real-time samples that vCenter recorded at 20-second intervals during that window (300 seconds / 20 seconds = 15 samples).

Standard Metrics: Displayed as the average of these 15 samples across the 5-minute window.
Peak / "Worst" Metrics: Represent the absolute maximum value (or a formulaic calculation of the peak) recorded among those 20-second samples. They capture brief, transient micro-spikes that are completely smoothed out in standard average metrics.
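The difference between the two kinds of metric can be illustrated with a minimal Python sketch. The sample values and the `summarize_cycle` helper are hypothetical and only illustrate the arithmetic; they are not the adapter's actual code.

```python
from statistics import mean

# Hypothetical values (%) for the fifteen 20-second samples in one
# 5-minute collection cycle: fourteen quiet samples and one brief spike.
samples = [0.5] * 14 + [35.0]

def summarize_cycle(cycle_samples):
    """Return (average, peak) for one collection cycle."""
    return mean(cycle_samples), max(cycle_samples)

avg, peak = summarize_cycle(samples)
print(f"average={avg:.2f} peak={peak:.2f}")  # average=2.80 peak=35.00
```

A single 20-second spike of 35 moves the 5-minute average only to 2.8, while the peak metric reports the full 35.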

2. vCenter Historical Data Rollup

A customer looking at vCenter charts later will likely not see these spikes. vCenter retains real-time 20-second granularity data only for a limited time (typically 1 hour). After that, vCenter rolls the data up into coarser historical averages (e.g., 5-minute, 30-minute, or hourly blocks), and the original 20-second micro-spikes are averaged away.

Because VCF Operations evaluates and logs the peak value during the live collection cycle, it permanently records that brief spike.
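The effect of the rollup can be sketched as follows. This is a simplified model that assumes a rollup is a plain average over consecutive samples; the data, the `rollup` and `peaks` helpers, and the spike position are hypothetical.

```python
from statistics import mean

SAMPLES_PER_CYCLE = 15  # 5 minutes / 20 seconds

def rollup(samples, size):
    """Average consecutive groups of `size` samples (simplified rollup model)."""
    return [mean(samples[i:i + size]) for i in range(0, len(samples), size)]

def peaks(samples, size):
    """Maximum of each consecutive group of `size` samples."""
    return [max(samples[i:i + size]) for i in range(0, len(samples), size)]

# One hour of hypothetical 20-second latency samples (ms): 180 samples
# at 2 ms with a single 80 ms spike.
realtime = [2.0] * 180
realtime[100] = 80.0

rolled = rollup(realtime, SAMPLES_PER_CYCLE)          # what a 5-minute rollup keeps
peak_per_cycle = peaks(realtime, SAMPLES_PER_CYCLE)   # what a peak metric keeps

print(max(rolled))          # highest value still visible after the rollup
print(max(peak_per_cycle))  # highest value recorded by the peak metric
```

In this model the 80 ms spike survives in the per-cycle peaks but shows up as only about 7.2 ms in the rolled-up averages, and coarser rollups would dilute it further.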

Metric Calculation Logic

Below is the calculation logic used by VCF Operations during a 5-minute collection cycle to derive these peak metrics from raw vCenter data:

Guest | Peak Disk Queue within collection cycle
Calculation: $\frac{\text{MAX}(\text{disk queue samples inside the collection cycle})}{100}$

Virtual Disk | Peak Latency within collection cycle (ms)
Calculation: $\text{peakReadLatency} + \text{peakWriteLatency}$
peakReadLatency: The absolute highest (maximum) amount of time it took to complete a single read I/O operation from the virtual disk during the specific collection interval. It is typically measured in milliseconds (ms).
peakWriteLatency: The absolute highest (maximum) amount of time it took to complete a single write I/O operation to the virtual disk during the specific collection interval. It is also measured in milliseconds (ms).

Network | Peak Transmitted Packet Dropped within collection cycle (%)
Calculation: $\frac{\text{MAX}(\text{net.droppedTx\_summation})}{\text{AVG}(\text{net.packetsTx\_summation})} \times 100$

(For more information on this specific behavior, see Broadcom KB 422942)
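The three calculations above can be written out as a short Python sketch. The function names, the sample data, and the zero-traffic guard are assumptions added for illustration; only the formulas themselves come from this article.

```python
from statistics import mean

def peak_disk_queue(queue_samples):
    """MAX(disk queue samples inside the collection cycle) / 100."""
    return max(queue_samples) / 100

def peak_virtual_disk_latency(peak_read_ms, peak_write_ms):
    """peakReadLatency + peakWriteLatency, in milliseconds."""
    return peak_read_ms + peak_write_ms

def peak_tx_dropped_pct(dropped_tx, packets_tx):
    """MAX(net.droppedTx_summation) / AVG(net.packetsTx_summation) x 100."""
    avg_tx = mean(packets_tx)
    if avg_tx == 0:
        return 0.0  # assumption: report 0% when nothing was transmitted
    return max(dropped_tx) / avg_tx * 100

# Hypothetical cycle: 1000 packets sent per 20-second sample,
# 50 packets dropped in a single sample, none in the other fourteen.
dropped = [0] * 14 + [50]
sent = [1000] * 15

peak_pct = peak_tx_dropped_pct(dropped, sent)
overall_pct = sum(dropped) / sum(sent) * 100

print(f"peak={peak_pct:.2f}% overall={overall_pct:.2f}%")
```

With this data the peak metric reports 5%, while the drop rate over the whole cycle is roughly 0.33%. Because the peak latency formula adds two maxima, its result can also exceed any single latency observed at one moment if the read and write peaks occurred at different times.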

Resolution

No resolution or fix is required, because the metrics accurately report the raw real-time data. Occasional, isolated micro-spikes are normal in healthy, active production environments.

If you want to view sustained performance trends rather than transient spikes, apply the following adjustments:

Use Interval Breakdown for Historical Timelines

Instead of relying on a single aggregated maximum value over a large time range (e.g., last 7 days), modify your reporting view:

Clone the Out-of-the-Box (OOTB) view (do not modify the default view directly).
Enable the Add Interval Breakdown option within the cloned view configuration.
This creates a separate row for each specified interval (such as every hour), transforming a flat summary into a historical timeline that makes it easier to distinguish isolated micro-bursts from genuine, sustained performance issues.
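The idea behind an interval breakdown can be sketched in Python. This only models the concept of one row per interval; the `hourly_breakdown` helper and the data are hypothetical and do not represent how the product builds the view.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hourly_breakdown(points):
    """Group (timestamp, value) points by hour and return the max per hour."""
    buckets = defaultdict(list)
    for ts, value in points:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: max(values) for hour, values in sorted(buckets.items())}

start = datetime(2024, 1, 1, 0, 0)
# Hypothetical peak values, one per 5-minute cycle over 3 hours (36 points).
points = [(start + timedelta(minutes=5 * i), 1.0) for i in range(36)]
points[14] = (points[14][0], 40.0)  # isolated spike at 01:10

flat_max = max(value for _, value in points)  # single aggregated maximum
per_hour = hourly_breakdown(points)           # one row per hour

print(flat_max)
for hour, value in per_hour.items():
    print(hour.strftime("%H:%M"), value)
```

The flat summary reports 40.0 for the whole range, whereas the hourly rows show that only the 01:00 interval was affected and the surrounding hours stayed at 1.0.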