This is a known issue affecting ESXi 5.5.
It is resolved in ESXi 5.5 Patch 4.
To confirm that you are experiencing this issue, run these commands:
esxtop command (Part 1)
- While the virtual machine is unresponsive, run esxtop from an SSH session to the ESXi host on which the affected virtual machine is registered.
- Find the virtual machine and expand its group: press e, then enter the group ID (GID) of the virtual machine to list its individual worlds.
- All vcpu worlds should show 100% %VMWAIT, and one of those vcpu worlds should also show 100% %SWPWT (or very close to it, such as 99%).
For example:
ID      GID     NAME            NWLD %USED  %RUN  %SYS  %WAIT %VMWAIT  %RDY %IDLE %OVRLP %CSTP %MLMTD %SWPWT
1171868 2220151 vmx                1  0.01  0.00  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171870 2220151 vmast.1171869      1  0.07  0.06  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171874 2220151 vmx-vthread-7:A    1  0.00  0.00  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171875 2220151 vmx-vthread-8:A    1  0.00  0.00  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171876 2220151 vmx-vthread-9:A    1  0.00  0.00  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171877 2220151 vmx-mks:AGPRODS    1  0.01  0.01  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171880 2220151 vmx-svga:AGPROD    1  0.00  0.00  0.00 100.00       -  0.00  0.00   0.00  0.00   0.00   0.00
1171881 2220151 vmx-vcpu-0:AGPR    1  0.00  0.00  0.00 100.00  100.00  0.00  0.00   0.00  0.00   0.00 100.00
1171882 2220151 vmx-vcpu-1:AGPR    1  0.00  0.00  0.00 100.00  100.00  0.00  0.00   0.00  0.00   0.00   0.00
1171883 2220151 vmx-vcpu-2:AGPR    1  0.00  0.00  0.00 100.00  100.00  0.00  0.00   0.00  0.00   0.00   0.00
1171884 2220151 vmx-vcpu-3:AGPR    1  0.00  0.00  0.00 100.00  100.00  0.00  0.00   0.00  0.00   0.00   0.00
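If you save the expanded esxtop rows to a text file, the Part 1 check can be sketched as a small shell function. This is only an illustration, not part of esxtop: the function name check_esxtop_vcpus and the default threshold of 99 are hypothetical, and the sketch assumes the column order shown in the example above and world names without spaces, so that %VMWAIT is field 9 and %SWPWT is field 15.

```shell
# Reads expanded esxtop rows for one virtual machine on stdin and reports
# whether the vcpu worlds match the pattern described in Part 1.
# Assumptions (not from esxtop itself): columns are in the order shown in
# the example, and world names contain no spaces, so that
#   field 1 = world ID, field 3 = NAME, field 9 = %VMWAIT, field 15 = %SWPWT.
# Optional argument: minimum percentage treated as "close to 100" (default 99).
check_esxtop_vcpus() {
    awk -v min="${1:-99}" '
        $3 ~ /^vmx-vcpu-/ {
            vcpus++
            if ($9 + 0 >= min) waiting++
            if ($15 + 0 >= min) {
                swapping++
                ids = ids (ids == "" ? "" : " ") $1
            }
        }
        END {
            # Every vcpu world must be waiting; at least one must be in swap wait.
            if (vcpus > 0 && waiting == vcpus && swapping >= 1) {
                print "MATCH swpwt_world=" ids
            } else {
                print "NO_MATCH"
                exit 1
            }
        }
    '
}
```

For example, with the rows saved to a hypothetical file named esxtop_rows.txt, run check_esxtop_vcpus < esxtop_rows.txt. The world ID printed after swpwt_world= is the one to compare against the ps output in Part 2.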
ps command (Part 2)
- Run this command:
ps -s | grep <vm_name>
- In the output, you should see one vmm world in the WAIT SWPC state and the remaining vmm worlds in the WAIT SEMA state.
For example:
36350 vmm0:HRWeb 36349 0 V WAIT SEMA
36352 vmm1:HRWeb 36349 0 V WAIT SWPC
Note: The world ID of the WAIT SWPC world in Part 2 should match the world ID of the 100% SWPWT world in Part 1. If the virtual machine remains in this state for an extended period, you are likely experiencing this issue.
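The Part 2 check and the comparison with Part 1 can be sketched as two shell helpers. The names swpc_world_ids and swpc_matches_world are hypothetical. The sketch assumes that each vmm line of the ps -s output starts with the world ID followed by a name of the form vmmN:<vm_name>, and that the state appears as the two adjacent words WAIT SWPC, as in the example output.

```shell
# Reads `ps -s | grep <vm_name>` output on stdin and prints the world ID of
# every vmm world whose state is WAIT SWPC.
# Assumption: field 1 is the world ID, field 2 is the world name (vmmN:...),
# and the state appears as the adjacent words "WAIT" "SWPC" later on the line.
swpc_world_ids() {
    awk '
        $2 ~ /^vmm[0-9]+:/ {
            for (i = 3; i < NF; i++) {
                if ($i == "WAIT" && $(i + 1) == "SWPC") {
                    print $1
                    break
                }
            }
        }
    '
}

# Succeeds if the world ID given as $1 (the world that showed about 100%
# %SWPWT in esxtop) is among the WAIT SWPC worlds read from stdin.
swpc_matches_world() {
    swpc_world_ids | grep -qxF -- "$1"
}
```

For example, ps -s | grep <vm_name> | swpc_matches_world <world_id> returns success only when the world ID from Part 1 is also in the WAIT SWPC state.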
To work around this issue, stop the virtual machine process to recover the virtual machine.
For more information, see Powering off an unresponsive virtual machine on an ESX host (1004340).
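As a rough sketch of the workaround, the helper below only prints the esxcli commands for stopping a virtual machine process, from least to most forceful, so that they can be reviewed before being run. The function name print_vm_kill_commands is hypothetical. The world ID passed in is assumed to be the World ID that esxcli vm process list reports for the virtual machine, which is not necessarily the vcpu world ID seen in esxtop or ps.

```shell
# Prints (does not run) the esxcli commands for stopping a virtual machine
# process, ordered from least to most forceful: soft, hard, force.
# $1: World ID of the virtual machine as listed by `esxcli vm process list`
#     (assumption: the caller has already looked this up).
print_vm_kill_commands() {
    world_id="$1"
    case "$world_id" in
        '' | *[!0-9]*)
            echo "usage: print_vm_kill_commands <world_id>" >&2
            return 1
            ;;
    esac
    for kill_type in soft hard force; do
        echo "esxcli vm process kill --type=$kill_type --world-id=$world_id"
    done
}
```

Try the soft type first and move to the next type only if the virtual machine process is still listed afterward.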