The CPU usage of VSCSISharedVdMainWorld is constantly at 100%

Article ID: 405226

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • The CPU usage of VSCSISharedVdMainWorld is consistently at 100%. To check the CPU usage of VSCSISharedVdMainWorld:
    1. Connect to the ESXi host via SSH and run the esxtop command.
    2. Type "e", then "1", and press Enter. This displays CPU statistics for the worlds in Group ID 1 (system).
    3. Type "g", then type "VSCSISharedVdMainWorld", and press Enter.
      If VSCSISharedVdMainWorld exists, the CPU statistics for that world are displayed at the top of the output. The %USED or %RUN value shows whether this world's CPU usage is close to 100%.



  • Even though the virtual machines on the host are mostly idle, one PCPU consistently shows 100% usage. In this example, the vSphere Client performance chart indicates that PCPU 13 is running at nearly 100% CPU utilization.



  • A virtual machine with a clustered VMDK is powered on on this ESXi host. For information about clustered VMDKs, refer to "Microsoft Windows Server Failover Clustering (WSFC) with shared disks on VMware vSphere 7.x: Guidelines for supported configurations".

Note:
Before troubleshooting this issue, confirm that the virtual machines running on the host are not the ones consuming the PCPU. This problem can occur even when the virtual machines are mostly idle.

Environment

VMware vSphere ESXi 8.0

Cause

This is a known issue in ESXi. Due to a rare timing problem, VSCSISharedVdMainWorld does not properly enter an idle state and continuously consumes CPU.

VSCSISharedVdMainWorld is a world that provides the functionality for clustered VMDKs.


To identify the process that is consuming the highly utilized PCPU:
1. Go to ESXi > Monitoring > Advanced > CPU usage %
2. Sort by "latest" and note the PCPU that has been running at high usage for the longest time.
3. In an SSH session to the affected ESXi host, run the following command, replacing <pcpu number> with the PCPU number from step 2:
sched-stats -t cpu | grep -vi 'WAIT' | grep -vi 'idle' | awk '$1 > 0 && $19 == "<pcpu number>"'
4. Review the processes in the running state.
5. If multiple processes show the same behavior, compare the output on each host.
6. If the PCPU running at high utilization changes, check the processes listed on the other PCPUs.
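
Because step 6 may require repeating the check for several PCPUs, the filter from step 3 can be wrapped in a small shell function so that the PCPU number is passed as an argument. This is only a sketch: the function name filter_pcpu is hypothetical and not part of ESXi, and it assumes that the ESXi shell supports POSIX function definitions and that its awk accepts the -v option. The filter logic itself is unchanged from step 3.

```shell
# Hypothetical helper (not an ESXi command): applies the same filter as step 3
# to whatever is piped in, keeping only rows whose 19th column equals the
# PCPU number given as the first argument.
filter_pcpu() {
  pcpu="$1"
  # Drop rows containing WAIT or idle, then keep rows with a first column > 0
  # on the requested PCPU. (p "") forces a string comparison, as in step 3.
  grep -vi 'WAIT' | grep -vi 'idle' | awk -v p="$pcpu" '$1 > 0 && $19 == (p "")'
}

# Usage on the ESXi host (43 is the PCPU number from this article's example):
#   sched-stats -t cpu | filter_pcpu 43
```

To check another PCPU, rerun the pipeline with a different argument instead of editing the awk expression.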

For example, if VSCSISharedVdMainWorld is the cause of the CPU usage on PCPU 43, the output of the sched-stats command above lists VSCSISharedVdMainWorld among the running worlds on PCPU 43.

Resolution

Broadcom is aware of this issue and plans to fix it in a future release. Currently, there is no workaround available. The clustered VMDK continues to function normally even while the CPU usage is high.