Imbalanced CPU usage and increased contention on ESXi hosts with AMD EPYC CPUs

Article ID: 307072

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
On ESXi hosts with AMD EPYC processors, such as Naples (Zen), you might experience the following symptoms:
  • Some ranges of PCPUs are highly utilized while others are not. 
  • Increased Ready / CoStop time on some VMs. 


Environment

VMware vSphere ESXi 6.5
VMware vSphere ESXi 6.7

Cause

This issue occurs because the CPU scheduler considers the relationships among scheduling contexts and places related contexts close together within a single last-level cache (LLC). Such relationships are not limited to virtual CPUs; they can also be established with I/O contexts. In general, this placement optimization minimizes inter-LLC communication overhead.

However, on AMD EPYC processors, each physical NUMA node may consist of multiple last-level caches. In such cases, the scheduler may move contexts that share a relationship toward a subset of the last-level caches, leaving the remaining last-level caches in the same NUMA node relatively idle. This concentrated placement may cause non-negligible ready time in certain cases.
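The imbalance described above can be illustrated with a minimal Python sketch that groups per-PCPU utilization by last-level cache within each NUMA node and flags nodes where one cache is busy while another sits idle. This is only an illustration, not an ESXi tool: the function names, the 40-point gap threshold, and the assumption that PCPUs are numbered contiguously within each last-level cache are all hypothetical, and the real topology and utilization figures would have to come from the host itself.

```python
from statistics import mean


def llc_utilization(pcpu_util, pcpus_per_llc, llcs_per_node):
    """Average per-PCPU utilization (percent) per LLC, grouped by NUMA node.

    Assumption (hypothetical): PCPUs are numbered contiguously, so LLC k
    holds PCPUs [k * pcpus_per_llc, (k + 1) * pcpus_per_llc).
    """
    if len(pcpu_util) % (pcpus_per_llc * llcs_per_node) != 0:
        raise ValueError("PCPU count does not divide into whole NUMA nodes")
    # One average per last-level cache.
    llc_avgs = [
        mean(pcpu_util[i:i + pcpus_per_llc])
        for i in range(0, len(pcpu_util), pcpus_per_llc)
    ]
    # Regroup the LLC averages so each inner list is one NUMA node.
    return [
        llc_avgs[i:i + llcs_per_node]
        for i in range(0, len(llc_avgs), llcs_per_node)
    ]


def imbalanced_nodes(nodes, gap_threshold=40.0):
    """Indices of NUMA nodes whose busiest and idlest LLC differ by more
    than gap_threshold percentage points (threshold is an arbitrary choice)."""
    return [
        n for n, llcs in enumerate(nodes)
        if max(llcs) - min(llcs) > gap_threshold
    ]


# Made-up sample: 2 NUMA nodes, 2 LLCs per node, 4 PCPUs per LLC.
sample = [
    90.0, 95.0, 85.0, 90.0,   # node 0, LLC 0: busy
    10.0, 5.0, 15.0, 10.0,    # node 0, LLC 1: nearly idle
    50.0, 50.0, 50.0, 50.0,   # node 1, LLC 0
    55.0, 45.0, 50.0, 50.0,   # node 1, LLC 1
]
nodes = llc_utilization(sample, pcpus_per_llc=4, llcs_per_node=2)
flagged = imbalanced_nodes(nodes)
```

In this made-up sample, node 0 shows the concentrated placement pattern (one cache heavily used, its sibling almost idle), while node 1 is evenly loaded.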

Resolution

This issue is resolved in ESXi 6.5 U3.
This issue is resolved in ESXi 6.7 U2.

Note: Although this issue also affects ESXi 6.0 and earlier, no AMD EPYC systems are supported on those releases.

Workaround:
Currently there is no workaround.