The ESXi host crashes with a purple screen of death (PSOD) reporting "Unexpected runqueue state encountered" on the same physical CPUs (PCPUs). This often correlates with an NMI triggered by a heartbeat timeout on a specific PCPU.
Heartbeat NMI: A Non-Maskable Interrupt (NMI) is triggered by a heartbeat timeout on a physical CPU (PCPU).
Hardware Logs: The System Event Log (SEL) or LogEFI may show "System Firmware Progress Unspecified" or "No heartbeat" messages.
The PSOD Stack contains one or more of the following strings:
CpuSched_VcpuRunStateChange@vmkernel
CpuSchedVcpuMakeReady@vmkernel
CpuSchedWakeupWorld@vmkernel
CpuSchedWakeupWorldInt@vmkernel
CpuSchedActionNotifyTraditionalVcpuid@vmkernel
CpuSched_ActionNotifyTraditionalVCPUSubset@vmkernel
CpuSchedActionNotifyHierarchical@vmkernel
CpuSched_ActionNotifyVCPUs@vmkernel
VMMVMKCall_Call@vmkernel
VMKVMM_ArchEnterVMKernel@vmkernel
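The function names listed above can be checked against a saved copy of the PSOD backtrace text. The following is a minimal Python sketch under stated assumptions, not a VMware tool: the helper name `matching_frames` is hypothetical, and the sample text is placeholder input rather than a real backtrace.

```python
# Function names copied from the list above, including the "@vmkernel" suffix
# so that "CpuSchedWakeupWorld" does not also match "CpuSchedWakeupWorldInt".
KNOWN_FRAMES = (
    "CpuSched_VcpuRunStateChange@vmkernel",
    "CpuSchedVcpuMakeReady@vmkernel",
    "CpuSchedWakeupWorld@vmkernel",
    "CpuSchedWakeupWorldInt@vmkernel",
    "CpuSchedActionNotifyTraditionalVcpuid@vmkernel",
    "CpuSched_ActionNotifyTraditionalVCPUSubset@vmkernel",
    "CpuSchedActionNotifyHierarchical@vmkernel",
    "CpuSched_ActionNotifyVCPUs@vmkernel",
    "VMMVMKCall_Call@vmkernel",
    "VMKVMM_ArchEnterVMKernel@vmkernel",
)


def matching_frames(backtrace_text):
    """Return the listed function names that occur in the given text."""
    return [name for name in KNOWN_FRAMES if name in backtrace_text]


if __name__ == "__main__":
    # Placeholder text standing in for a transcribed backtrace.
    sample = "CpuSchedWakeupWorldInt@vmkernel\nVMMVMKCall_Call@vmkernel\n"
    for name in matching_frames(sample):
        print("found:", name)
```

An empty result only means none of the listed names appear in the text supplied; it does not by itself rule this issue in or out.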
cpu81:4001794)CpuSched: 5595: Unexpected runqueue state encountered!
cpu80:2098638)CpuSched: 5595: Unexpected runqueue state encountered!
Prior to the crash, vmkernel repeatedly logged errors similar to the ones below, as can be seen in /var/run/log/vmkernel.*:
[YYYY-MM-DDTHH:MM:SS] cpuXX:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuXX:2098638)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:4001794)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuXX:2098638)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuXX:4001794)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:2098638)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:2105937)CpuSched: 5595: Unexpected runqueue state encountered!
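To see whether the message keeps recurring on the same PCPUs, the log lines above can be tallied per CPU and per world ID. The following is a minimal Python sketch, not a VMware tool: the function name `count_runqueue_alerts` and the regular expression are assumptions derived only from the line format shown above, and the sample lines reuse the CPU numbers and world IDs from this article with the timestamp left as a placeholder.

```python
import re
from collections import Counter

# Assumed line shape, taken from the samples above:
#   cpu<N>:<worldID>)[ALERT: ]CpuSched: <number>: Unexpected runqueue state encountered!
PATTERN = re.compile(
    r"cpu(?P<cpu>\d+):(?P<world>\d+)\)(?:ALERT: )?CpuSched: \d+: "
    r"Unexpected runqueue state encountered!"
)


def count_runqueue_alerts(lines):
    """Return (per-PCPU counts, per-world counts) for matching lines."""
    per_cpu, per_world = Counter(), Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            per_cpu[int(match.group("cpu"))] += 1
            per_world[int(match.group("world"))] += 1
    return per_cpu, per_world


if __name__ == "__main__":
    sample = [
        "[YYYY-MM-DDTHH:MM:SS] cpu81:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!",
        "[YYYY-MM-DDTHH:MM:SS] cpu80:2098638)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!",
        "[YYYY-MM-DDTHH:MM:SS] cpu81:4001794)CpuSched: 5595: Unexpected runqueue state encountered!",
    ]
    per_cpu, per_world = count_runqueue_alerts(sample)
    for cpu, count in per_cpu.most_common():
        print(f"PCPU {cpu}: {count} occurrence(s)")
```

Because the function accepts any iterable of strings, an open file object for a copy of a vmkernel log can be passed in place of the sample list. A count concentrated on a small set of PCPUs is consistent with the symptom described here, but it is only an indicator.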
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x