Host is exceeding 100% CPU and VMs vMotion slowly or are unable to vMotion

Article ID: 412235

Products

VMware vSphere ESXi
VMware vCenter Server

Issue/Introduction

  • Host CPU usage is exceeding 100%.
  • VMs migrate slowly with vMotion or cannot be migrated at all.
  • Some VMs show bootstrap errors and will not boot.
  • Network timeouts occur.

Cause

  • DRS (Distributed Resource Scheduler) aggressively places large, high-CPU VMs on a single host (host 16 in this case) and fails to balance workloads across the cluster, even though other hosts are underutilized.
  • Network and storage heartbeats fail intermittently. This is suspected to be a symptom of CPU saturation on the host rather than a direct hardware fault.
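
The imbalance described above can be illustrated with a short Python sketch. It is not how DRS itself measures balance. It only flags hosts above a CPU threshold and shows how lightly loaded the rest of the cluster is. The host names, utilization figures, and the 90% threshold are assumptions made for illustration; substitute values read from the cluster's host view.

```python
from statistics import mean, pstdev

# Hypothetical per-host CPU utilization in percent. These names and numbers
# are illustrative only; they are not taken from the article.
host_cpu_pct = {
    "host-14": 35.0,
    "host-15": 40.0,
    "host-16": 100.0,
    "host-17": 30.0,
}


def find_overloaded(usage, threshold_pct=90.0):
    """Return the hosts at or above threshold_pct (an assumed cutoff), sorted by name."""
    return sorted(host for host, pct in usage.items() if pct >= threshold_pct)


def spread(usage):
    """Population standard deviation of utilization; a larger value means less balance."""
    return pstdev(list(usage.values()))


overloaded = find_overloaded(host_cpu_pct)
others = [pct for host, pct in host_cpu_pct.items() if host not in overloaded]

print("Overloaded hosts:", overloaded)
print("Mean CPU on remaining hosts: %.1f%%" % mean(others))
print("Utilization spread (std dev): %.1f" % spread(host_cpu_pct))
```

One saturated host next to a low mean on the remaining hosts matches the pattern described in the cause: the cluster has spare capacity, but it is not being used.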

Resolution

  1. Place the host back into Maintenance Mode.
  2. Allow the VMs to finish migrating to other hosts.
  3. Set DRS to Manual mode. This prevents DRS from moving VMs between hosts on its own but still allows it to calculate which VMs it would like to move to the host.
  4. Take the host out of Maintenance Mode.
  5. Go to Cluster -> Monitor -> vSphere DRS -> Recommendations.
    1. Look at the total number of provisioned CPUs across the VMs that DRS wants to move.
  6. Manually move VMs based on the DRS recommendations, a couple at a time.
  7. Watch the host CPU usage and check which VMs DRS suggests moving next.
  8. Watch the sum of CPU utilization under Cluster -> Monitor -> vSphere DRS -> CPU Utilization.
  9. Watch the sum of memory utilization under Cluster -> Monitor -> vSphere DRS -> Memory Utilization.
  10. After the hosts are balanced, set DRS back to Automatic.
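
As a planning aid for steps 5 and 6, the following Python sketch totals the provisioned CPUs in a set of DRS recommendations and splits them into batches of two. It does not talk to vCenter. The `Recommendation` class, the VM names, and the vCPU counts are hypothetical and would be transcribed by hand from the Recommendations view.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """One DRS recommendation, copied by hand from the Recommendations view."""
    vm_name: str
    provisioned_vcpus: int


def total_vcpus(recs):
    """Sum of provisioned vCPUs across the given recommendations."""
    return sum(r.provisioned_vcpus for r in recs)


def batches(recs, size=2):
    """Yield recommendations a couple at a time, in the order they were listed."""
    for i in range(0, len(recs), size):
        yield recs[i:i + size]


# Hypothetical recommendations; names and sizes are illustrative only.
recs = [
    Recommendation("vm-a", 16),
    Recommendation("vm-b", 8),
    Recommendation("vm-c", 4),
    Recommendation("vm-d", 8),
    Recommendation("vm-e", 2),
]

print("Total provisioned vCPUs DRS wants to move:", total_vcpus(recs))
for n, batch in enumerate(batches(recs), start=1):
    names = ", ".join(r.vm_name for r in batch)
    print(f"Batch {n}: {names} ({total_vcpus(batch)} vCPUs)")
```

Because DRS recalculates after every move, treat the batches as a starting plan only. Re-read the recommendations and the host CPU after each batch, as in step 7, before moving the next one.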