ESXi Hosts PSOD Frequently with Varying Stacks


Article ID: 436598



Products

VMware vSphere ESXi

Issue/Introduction

An ESXi host experiences recurring Purple Screen of Death (PSOD) events.

  • The crashes occur across different workloads and may display varying error signatures, including:
    • PF Exception 14 (Page Fault)
    • Recursive panic (cpu ##, world #####, depth x)
    • PCPU ##: no heartbeat (0/3 IPIs received)
  • The /var/run/log/LogEFI.log file shows that multiple PSODs have occurred over time and that the failures consistently occur on the same physical cores (cpu184 and cpu185 in the following excerpt):

YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)ESC[45mESC[33;1mVMware ESXi 8.0.3 [Releasebuild-24626799 x86_64]ESC[0m
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)cr0=0x80010031 cr2=0x4f6d4d5024 cr3=0x92a8f02000 cr4=0x140768
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)FMS=1a/11/0 uCode=0xb101028
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)Code start: 0x420015e00000 VMK uptime: 5:17:16:58.379
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b1b0:[0x420015f7bbc0]PanicvPanicInt@vmkernel#nover+0x20c stack: 0x7
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b260:[0x420015f7c33c]Panic_NoSave@vmkernel#nover+0x4d stack: 0x453cde79b2c0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b2c0:[0x420015f7c521]Panic_OnAssertAt@vmkernel#nover+0xba stack: 0x2f1f00000000
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b340:[0x4200164a25e7]Int6_UD2Assert@vmkernel#nover+0x260 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b370:[0x42001649b0c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b438:[0x4200164cf938]CpuSched_PreemptionPointUncond@vmkernel#nover+0x14 stack: 0x6a8
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79b440:[0x4200163f8d14]UserMem_HandleMapFault@vmkernel#nover+0x20f5 stack: 0x4317f420f240
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79be10:[0x42001642ea84]User_ArchExceptionHandleFault@vmkernel#nover+0x171 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79bec0:[0x4200163d08bd]User_Exception@vmkernel#nover+0x96 stack: 0x453cde79bf40
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79bf20:[0x4200164a1472]Int14_PageFault@vmkernel#nover+0x1a7 stack: 0x3
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)0x453cde79bf40:[0x42001649b0c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:9902404)base fs=0x0 gs=0x42006e000000 Kgs=0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)ESC[45mESC[33;1mVMware ESXi 8.0.3 [Releasebuild-24626799 x86_64]ESC[0m
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)cr0=0x80050033 cr2=0x18 cr3=0x12261965000 cr4=0x150660
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)FMS=1a/11/0 uCode=0xb10104e
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)frame=0x453cfba9bde0 ip=0x42000c5b02d3 err=0x2 rflags=0x10206
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)rax=0x42000c5b02d3 rbx=0x1 rcx=0x42000c85fb1d
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)rdx=0x1 rbp=0x42000c85fb1d rsi=0x2
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)rdi=0x8628a0 r8=0x2 r9=0x7ffffffe000
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)r10=0x0 r11=0x10 r12=0x453cf4c1f100
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)r13=0x2 r14=0x453cfba9bfdc r15=0xfffffffffc407d60
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)Code start: 0x42000c400000 VMK uptime: 6:17:55:28.480
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0x453cfba9bea0:[0x42000c5b02d3]WorldFindInt@vmkernel#nover+0x17 stack: 0x453cf4c1f100
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0x453cfba9bed0:[0x42000c85fb1c]CpuSchedActionNotifyTraditionalVcpuid@vmkernel#nover+0x29 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0x453cfba9bee0:[0x42000c85feb3]CpuSched_ActionNotifyVcpuid@vmkernel#nover+0x38 stack: 0x1c
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0x453cfba9bf00:[0x42000caad69e]VMMVMKCall_Call@vmkernel#nover+0x103 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0x453cfba9bfd0:[0x42000caaa0d0]VMKVMM_ArchEnterVMKernel@vmkernel#nover+0x21 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0xfffffffffc407cd8:[0xfffffffffc067f5a] (vmm64)
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0xfffffffffc407d60:[0xfffffffffc068335]
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)0x8:[0x10]
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu185:8792228)base fs=0x0 gs=0x42006e400000 Kgs=0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)ESC[45mESC[33;1mVMware ESXi 8.0.3 [Releasebuild-24626799 x86_64]ESC[0m
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)cr0=0x8001003d cr2=0x18 cr3=0x7040e000 cr4=0x14016c
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)FMS=1a/11/0 uCode=0xb10104e
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)frame=0x453ce171bb60 ip=0x42000f05bdaa err=0x2 rflags=0x10246
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)rax=0x42000f05bdaa rbx=0x453ce171bdd0 rcx=0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)rdx=0x0 rbp=0x0 rsi=0x41ffc7c00cae
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)rdi=0x4305a9e02330 r8=0x1d28078 r9=0x433e8f0315a0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)r10=0x45c515475148 r11=0x45c515475168 r12=0x1
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)r13=0x43096a6fac40 r14=0x4305aa5b2ac0 r15=0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)Code start: 0x42000ee00000 VMK uptime: 11:07:20:45.041
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171bc20:[0x42000f05bdaa]IOChain_Resume@vmkernel#nover+0x2de stack: 0x6
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171bcd0:[0x42000f06cae1]Port_InputResume@vmkernel#nover+0x86 stack: 0x1
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171bd20:[0x42000f0b1461]Vmxnet3VMKDevTQDoTx@vmkernel#nover+0x23a stack: 0x430a79c06d40
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171bee0:[0x42000f0bcb88]Vmxnet3VMKDev_AsyncTx@vmkernel#nover+0xb9 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171bf50:[0x42000f126ba5]NetWorldPerVMCB@vmkernel#nover+0x1aa stack: 0x430395ff7188
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171bfe0:[0x42000f4d67b2]CpuSched_StartWorld@vmkernel#nover+0xbf stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)0x453ce171c000:[0x42000ef44cef]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:30572664)base fs=0x0 gs=0x42006e000000 Kgs=0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)ESC[45mESC[33;1mVMware ESXi 8.0.3 [Releasebuild-24626799 x86_64]ESC[0m
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)cr0=0x80010031 cr2=0xb76d449f60 cr3=0x188bf1f4000 cr4=0x140768
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)FMS=1a/11/0 uCode=0xb10104e
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)Code start: 0x420031e00000 VMK uptime: 21:07:10:17.789
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09aec0:[0x420031f7bbc0]PanicvPanicInt@vmkernel#nover+0x20c stack: 0x2
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09af70:[0x420031f7c33c]Panic_NoSave@vmkernel#nover+0x4d stack: 0x453d3b09afd0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09afd0:[0x420031f7c521]Panic_OnAssertAt@vmkernel#nover+0xba stack: 0x2e6f00000000
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b050:[0x4200324a25e7]Int6_UD2Assert@vmkernel#nover+0x260 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b080:[0x42003249b0c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b140:[0x4200324d9867]CpuSched_YieldThrottled@vmkernel#nover+0x11f stack: 0x1000
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b170:[0x420031e6954b]VisorFSTarDoIOInt@vmkernel#nover+0x10c stack: 0x9316bf4000
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b1e0:[0x420031e62ab2]VisorFSObjDoIOInt@vmkernel#nover+0xcb stack: 0x69c5171f69c5171f
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b250:[0x420031e5e7b0]VisorFSFileIO@vmkernel#nover+0x1dd stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b2b0:[0x420031e42038]FSSVec_FileIO@vmkernel#nover+0x21 stack: 0x867e9799
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b2d0:[0x4200324bc52b]FSSFileIO@vmkernel#nover+0x17c stack: 0x190
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b330:[0x4200324bc737]FSS_SGFileIO@vmkernel#nover+0x3c stack: 0xbad000e
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b380:[0x4200323e4e70]UserFileReadMPN@vmkernel#nover+0xf5 stack: 0x2da618b4e8
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09b440:[0x4200323f7f7d]UserMem_HandleMapFault@vmkernel#nover+0x135e stack: 0x435659c13650
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09be10:[0x42003242ea84]User_ArchExceptionHandleFault@vmkernel#nover+0x171 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09bec0:[0x4200323d08bd]User_Exception@vmkernel#nover+0x96 stack: 0x453d3b09bf40
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09bf20:[0x4200324a1472]Int14_PageFault@vmkernel#nover+0x1a7 stack: 0x3
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)0x453d3b09bf40:[0x42003249b0c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0
YYYY:MM:DDTHH:MM:SS In(14) LogEFI: cpu184:66656291)base fs=0x0 gs=0x42006e000000 Kgs=0x0
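
To check whether the PSODs cluster on particular cores, the records in LogEFI.log can be tallied per CPU. The following Python script is a minimal sketch, not a supported tool. It assumes that every PSOD record begins with a banner line of the form shown above ("LogEFI: cpu<N>:<world>)" followed by "VMware ESXi ... [Releasebuild-...]") and that ESX 9.x writes a comparable banner. The function name psods_per_cpu and the regular expression are illustrative and should be adjusted if the log format differs.

```python
import re
import sys
from collections import Counter

# Assumed banner format, taken from the excerpt above: each PSOD record
# starts with a line such as
#   ... LogEFI: cpu184:9902404)<colour codes>VMware ESXi 8.0.3 [Releasebuild-24626799 x86_64]
# "VMware ESX" also matches "VMware ESXi".
BANNER = re.compile(r"LogEFI: cpu(\d+):(\d+)\).*VMware ESX.*Releasebuild")


def psods_per_cpu(lines):
    """Count PSOD banner lines per physical CPU number (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = BANNER.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Default path is the one named in this article; another path can be
    # passed as the first argument (for example, a copy from a support bundle).
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/run/log/LogEFI.log"
    try:
        with open(path, errors="replace") as log:
            counts = psods_per_cpu(log)
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
    else:
        for cpu, count in counts.most_common():
            print(f"cpu{cpu}: {count} PSOD record(s)")
```

For the excerpt above, this tally gives three records on cpu184 and one on cpu185. A small set of CPU numbers accounting for most records, despite differing backtraces, matches the pattern described in this article.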

Environment

ESXi 8.x 

ESX 9.x 

Cause

Recurring PSODs with unrelated backtraces that are repeatedly reported on the same physical CPU cores indicate a hardware-level failure of those cores.

 

Resolution

Contact the server hardware vendor to report a suspected hardware failure on the identified physical cores and to run hardware diagnostics.