CPU usage increase after update to ESXi 7.0.3 or later

Article ID: 335065

Updated On: 09-13-2024

Products

VMware vSphere ESXi

Issue/Introduction

On ESXi hosts with HyperThreading (HT / SMT) enabled, CPU usage and demand may increase after upgrading to ESXi 7.0 U3 (or later) compared to any previous release. CPU utilization and contention metrics (e.g. Ready Time) do not increase. The effect is more pronounced on highly utilized hosts running workloads whose vCPUs migrate often between PCPUs, e.g. context-switch-heavy VDI or Citrix workloads.

Environment

VMware vSphere ESXi 7.0.3

Cause

To elaborate on the cause and impact, we have to differentiate between CPU usage and CPU utilization as they are different metrics.

CPU utilization refers to whether a PCPU is idle or not, i.e. a PCPU is utilized when anything but its idle thread is executing. At the host level, when SMT is enabled, 50% average CPU utilization can mean that every core has one of its threads utilized or that half of the cores have both of their threads utilized. The throughput of two hosts with these two similar-looking utilization figures could be very different.

CPU usage is a qualitative metric that incorporates more than just utilization, such as the frequency of the underlying CPU or whether a world has to share the core with another world at the time of execution. An example of how frequency can cause a difference between usage and utilization is a 50% utilized PCPU that shows 60% CPU usage because it runs at 20% Turbo Boost above the CPU's base frequency. For HyperThreading, ESXi assumes a flat 25% throughput benefit for the core when two threads utilize it at the same time.
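
The arithmetic behind these two examples can be sketched in a few lines of Python. This is an illustrative model only, not ESXi's actual accounting code: the function names, the 16-core host, and the idea of expressing throughput in "one fully busy single-thread core" units are assumptions made for the example. Only the 20% Turbo Boost figure and the flat 25% HyperThreading benefit come from the text above.

```python
# Illustrative model only; names and units are assumptions, not ESXi internals.
HT_CORE_BENEFIT = 1.25  # from the article: flat 25% benefit when both threads are busy


def usage_from_frequency(utilization: float, freq_ratio: float) -> float:
    """Scale utilization by effective/base frequency (1.2 means 20% Turbo Boost)."""
    return utilization * freq_ratio


def host_utilization(cores: int, one_thread: int, two_threads: int) -> float:
    """Average PCPU utilization across the 2 * cores hardware threads."""
    if one_thread + two_threads > cores:
        raise ValueError("more busy cores than cores on the host")
    return (one_thread + 2 * two_threads) / (2 * cores)


def host_throughput(cores: int, one_thread: int, two_threads: int) -> float:
    """Relative throughput in 'one fully busy single-thread core' units."""
    if one_thread + two_threads > cores:
        raise ValueError("more busy cores than cores on the host")
    return one_thread * 1.0 + two_threads * HT_CORE_BENEFIT


# Frequency example: a 50% utilized PCPU running 20% above base frequency.
turbo_usage = usage_from_frequency(0.50, 1.20)

# Two hypothetical 16-core hosts, both at 50% average utilization.
spread = (host_utilization(16, 16, 0), host_throughput(16, 16, 0))  # one thread per core
packed = (host_utilization(16, 0, 8), host_throughput(16, 0, 8))    # both threads on half the cores

print(f"turbo usage: {turbo_usage:.0%}")
print(f"spread: utilization={spread[0]:.0%}, throughput={spread[1]}")
print(f"packed: utilization={packed[0]:.0%}, throughput={packed[1]}")
```

Under these assumptions both hosts report 50% utilization, but the host with one busy thread on every core delivers 16.0 throughput units while the host with both threads busy on half its cores delivers 10.0, which is why utilization alone does not describe how much work a host is doing.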

ESXi 7.0 U3 fixed a long-existing issue in which CPU usage could be under-accounted. Before the fix, when a vCPU world migrated between different PCPUs while the other PCPU of the core was also being utilized, a rare race condition could result in some of that utilization not being charged to the vCPU world and the PCPU as usage.

Resolution

None. This is expected behavior if the same workload was affected by CPU usage under-accounting in the past.

Additional Information

Impact/Risks:
There is no negative impact on host density or performance; this is a change in reported CPU usage only. If there is increased CPU utilization, reduced performance, more ready time, or more co-stop, this KB does not apply, or applies only partially, to the relative increase in CPU usage.

Note: Increased CPU utilization (and subsequently usage) between different ESXi releases can have many reasons:

More efficient storage or network IO handling can reduce I/O wait and keep the CPU busier
Improved world scheduling and reduced ready or co-stop time
Other changes unrelated to the ESXi upgrade might also contribute, e.g. guest OS or application updates that were done during the same timeframe
