Linux VM shows 100% CPU utilization in vCenter while guest OS reports idle



Article ID: 324263


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
A Red Hat VM shows 100% CPU usage in esxtop and in the vCenter performance charts, while tools inside the guest OS report the CPU as idle.

Environment

Guest Operating Systems
Red Hat Enterprise Linux (RHEL) 8.4 and above
Ubuntu (LTS and Server versions)
SUSE Linux Enterprise Server (SLES) (All recent versions)
Linux kernel 4.18.0-305.el8 and above

Cause

CPU utilization is constantly high because the guest kernel is booted with the parameters "idle=poll" and "intel_idle.max_cstate=0".
 
  • The idle=poll and intel_idle.max_cstate=0 parameters do not allow the CPU to rest in an idle state, so CPU utilization is constantly high.
    See: Low Latency Performance Tuning for Red Hat Enterprise Linux 7 (look for idle=poll)
  • idle=poll is a way to reduce the number of scheduler calls and IPIs (inter-processor interrupts), at a power and cooling cost. It keeps processors at their maximum frequency and C-state, and changing it requires a reboot. As a side effect, the CPUs may not have the thermal headroom to enter turbo frequencies, which can affect performance.
  • "On a physical system, idle=poll prevents the CPUs from entering power-saving states (C states) since when the CPU transitions into an idle state, a busy computation is entered. On a VM, it does nothing to control host CPU C states, but it does have the unwanted side effect of causing the host CPUs in use by the guest VM to operate at 100% capacity."
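
To confirm that a guest is affected, check the command line the running kernel was booted with. The following is a minimal Python sketch, not part of the original article. It assumes the standard /proc/cmdline file is readable inside the guest, and the function name find_polling_params is hypothetical.

```python
from pathlib import Path
from typing import List

# The two parameters this article identifies as the cause.
POLLING_PARAMS = ("idle=poll", "intel_idle.max_cstate=0")


def find_polling_params(cmdline: str) -> List[str]:
    """Return the parameters from POLLING_PARAMS present in a kernel command line."""
    tokens = cmdline.split()  # kernel parameters are whitespace-separated
    return [param for param in POLLING_PARAMS if param in tokens]


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")  # command line of the running kernel
    if proc_cmdline.exists():
        found = find_polling_params(proc_cmdline.read_text())
        if found:
            print("Parameters that keep the vCPUs busy:", ", ".join(found))
        else:
            print("Neither parameter is set on the running kernel.")
```

If either parameter is reported, the guest matches the cause described above.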

Resolution

Remove the following parameters from the kernel command line at the guest OS level:

  1. Remove these parameters:
       idle=poll
       intel_idle.max_cstate=0
  2. After removing these parameters, reboot the VM for the changes to take effect.

Once the VM restarts, CPU utilization should return to normal.
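
The removal step can be illustrated with a short Python sketch that strips both parameters from a kernel command-line string. This is an illustration under stated assumptions, not a documented procedure: the function name strip_polling_params is hypothetical, and it only transforms a string. Writing the result back to the bootloader configuration is distribution-specific and is not shown here.

```python
# The two parameters this article says to remove.
PARAMS_TO_REMOVE = {"idle=poll", "intel_idle.max_cstate=0"}


def strip_polling_params(cmdline: str) -> str:
    """Return the command line with both parameters removed, all others kept in order."""
    kept = [token for token in cmdline.split() if token not in PARAMS_TO_REMOVE]
    return " ".join(kept)


if __name__ == "__main__":
    # Hypothetical example value; a real one comes from the guest's bootloader configuration.
    before = "ro quiet idle=poll intel_idle.max_cstate=0 crashkernel=auto"
    print(strip_polling_params(before))
```

Only exact matches are removed, so a different setting such as intel_idle.max_cstate=1 would be left in place.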

Additional Information

The kernel parameters idle=poll and intel_idle.max_cstate=0 are generic tuning options in the core Linux kernel, available on virtually all major Linux distributions, including Debian, Ubuntu, Fedora, SUSE, and Arch Linux. If they are applied to any modern Linux distribution running as a VM, the same 100% host CPU utilization symptom will occur.