ESXi host crashes with PSOD (purple screen of death) "Unexpected runqueue state encountered" on same PCPUs


Article ID: 368557


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

An ESXi host crashes with a PSOD (purple screen of death) reporting "Unexpected runqueue state encountered", and the crashes recur on the same physical CPUs (PCPUs). This often correlates with an NMI triggered by a heartbeat timeout on a specific PCPU.

Heartbeat NMI: A Non-Maskable Interrupt (NMI) is triggered by a heartbeat timeout on a physical CPU (PCPU).

Hardware Logs: The System Event Log (SEL) or LogEFI may show "System Firmware Progress Unspecified" or "No heartbeat" messages.

The PSOD Stack contains one or more of the following strings:

CpuSched_VcpuRunStateChange@vmkernel
CpuSchedVcpuMakeReady@vmkernel
CpuSchedWakeupWorld@vmkernel
CpuSchedWakeupWorldInt@vmkernel
CpuSchedActionNotifyTraditionalVcpuid@vmkernel
CpuSched_ActionNotifyTraditionalVCPUSubset@vmkernel
CpuSchedActionNotifyHierarchical@vmkernel
CpuSched_ActionNotifyVCPUs@vmkernel
VMMVMKCall_Call@vmkernel
VMKVMM_ArchEnterVMKernel@vmkernel#
cpu81:4001794)CpuSched: 5595: Unexpected runqueue state encountered!
cpu80:2098638)CpuSched: 5595: Unexpected runqueue state encountered!

Prior to the crash, vmkernel repeatedly logged errors similar to the following, which can be found in /var/run/log/vmkernel.*:

[YYYY-MM-DDTHH:MM:SS] cpuXX:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuXX:2098638)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:4001794)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuXX:2098638)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuXX:4001794)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:2098638)CpuSched: 5595: Unexpected runqueue state encountered!
[YYYY-MM-DDTHH:MM:SS] cpuYY:2105937)CpuSched: 5595: Unexpected runqueue state encountered!
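To see whether the messages cluster on the same PCPUs, the log lines can be tallied per CPU number. The following is a minimal sketch, not a VMware tool: the regular expression is inferred from the excerpt above (with the "ALERT: " prefix treated as optional), and the function name and sample lines are illustrative assumptions.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt in this article, e.g.
# "cpu81:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!"
# The "ALERT: " prefix is optional because both forms appear in the excerpt.
RUNQUEUE_RE = re.compile(
    r"cpu(?P<pcpu>\d+):(?P<world>\d+)\)(?:ALERT: )?CpuSched: \d+: "
    r"Unexpected runqueue state encountered!"
)


def count_runqueue_alerts(lines):
    """Return a Counter mapping PCPU number -> number of matching log lines."""
    counts = Counter()
    for line in lines:
        match = RUNQUEUE_RE.search(line)
        if match:
            counts[int(match.group("pcpu"))] += 1
    return counts


# Illustrative sample lines (hypothetical data modeled on the excerpt above).
sample = [
    "cpu81:4001794)CpuSched: 5595: Unexpected runqueue state encountered!",
    "cpu80:2098638)CpuSched: 5595: Unexpected runqueue state encountered!",
    "cpu81:4001794)ALERT: CpuSched: 5595: Unexpected runqueue state encountered!",
    "cpu3:1000)some unrelated message",
]

counts = count_runqueue_alerts(sample)
for pcpu, hits in counts.most_common():
    print(f"cpu{pcpu}: {hits}")
```

For a real log, the same function can be fed an open file object in place of the sample list. A small set of PCPU numbers accounting for all hits is consistent with the pattern described in this article.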

 



Environment

VMware vSphere ESXi 7.x

VMware vSphere ESXi 8.x

Cause

  • This is a hardware issue, usually caused by a faulty physical CPU or CPU socket.

 

Resolution

  • To confirm whether a specific processor is faulty, swap the physical CPUs between the sockets and observe whether the crash follows the suspected CPU.
  • As a reference for mapping the reported PCPU numbers to a processor: on a host with two physical processors (packages) of 10 cores each and hyper-threading disabled, logical CPUs 0–9 belong to Processor 1, and logical CPUs 10–19 belong to Processor 2.
  • If, after the swap, the crash recurs on the PCPUs of the socket that now holds the suspected processor, this confirms the processor is at fault and it should be replaced.
  • Engage your hardware vendor to perform complete hardware diagnostics.
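The PCPU-to-processor mapping in the 10-core example above is plain integer division. The sketch below assumes logical CPUs are numbered contiguously per package, as in that example; the function name is hypothetical, and the actual topology should be checked on the host itself (for example with `esxcli hardware cpu list`) before drawing conclusions.

```python
def package_for_pcpu(pcpu, logical_cpus_per_package):
    """Return the 1-based processor (package) number for a 0-based logical CPU.

    Assumes contiguous numbering per package, as in the article's example
    (logical CPUs 0-9 on Processor 1, 10-19 on Processor 2).
    """
    if pcpu < 0 or logical_cpus_per_package <= 0:
        raise ValueError("pcpu must be >= 0 and logical_cpus_per_package > 0")
    return pcpu // logical_cpus_per_package + 1


# The article's example: two packages with 10 logical CPUs each.
for cpu in (0, 9, 10, 19):
    print(f"cpu{cpu} -> Processor {package_for_pcpu(cpu, 10)}")
```

If the crashing PCPUs map to one processor before the swap and to the other socket's range afterwards, the fault has followed the processor.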