ESXi host crashes with a PSOD error with backtrace event "ProcFSRemoveNode@(procfs)"

Article ID: 344752

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
The ESXi host fails with a purple diagnostic screen (PSOD) that shows entries similar to the following:

Panic Details: Crash at YYYY-MM-DDT19:15:23.723Z on CPU 14 running world 2097422. VMK Uptime:1:04:13:17.067
Panic Message: @BlueScreen: NMI IPI: Panic requested by another PCPU. RIPOFF(base):RBP:CS [0x8442bf(0x41803b400000):0x43320fa22010:0xfc8] (Src 0x4, CPU14)
 0x450a800e2d10:[0x41803b50ba45]PanicvPanicInt@vmkernel#nover+0x439 stack: 0x41803b896468, 0x41803b8963b0, 0x450a800e2db8, 0x430332ede988, 0x450a00000001
 0x450a800e2db0:[0x41803b50bcd1]Panic_WithBacktrace@vmkernel#nover+0x56 stack: 0x450a800e2e20, 0x450a800e2dd0, 0x1960, 0x1960, 0x8442bf
 0x450a800e2e20:[0x41803b508ac1]NMI_Interrupt@vmkernel#nover+0x3c2 stack: 0x0, 0xfc8, 0x2034312075706370, 0x6b636f4c6e697053, 0x756f206e69707320
 0x450a800e2ea0:[0x41803b544f0c]IDTNMIWork@vmkernel#nover+0x99 stack: 0x0, 0x0, 0x0, 0x0, 0x0
 0x450a800e2f20:[0x41803b546400]Int2_NMI@vmkernel#nover+0x19 stack: 0x0, 0x41803b563067, 0xfd0, 0xfd0, 0x0
 0x450a800e2f40:[0x41803b563066]gate_entry@vmkernel#nover+0x67 stack: 0x0, 0x0, 0xf, 0x430bff62e5f0, 0x0
 0x451b0871bdc0:[0x41803bc442bf]ProcFSRemoveNode@(procfs)#<None>+0x53 stack: 0x43320fa23ea0, 0x41803bb4e2bf, 0x43320fa2a8b0, 0x43320fa23e90, 0x43320fa22010
 0x451b0871bde0:[0x41803bb4e2be]UserProc_Remove@(user)#<None>+0xd7 stack: 0x43320fa22010, 0xffffffff, 0x451b40f23000, 0x430bec638070, 0x43320fa22010
  

Saved backtrace from: pcpu 14 SpinLock spin out NMI
 0x451b0871bdc0:[0x41803bc442be]ProcFSRemoveNode@(procfs)#<None>+0x53 stack: 0x43320fa23ea0
 0x451b0871bde0:[0x41803bb4e2be]UserProc_Remove@(user)#<None>+0xd7 stack: 0x43320fa22010
 0x451b0871be30:[0x41803bb78057]UserCartel_CartelCleanup@(user)#<None>+0x24 stack: 0xffffffff
 0x451b0871be70:[0x41803bb4909c]UserModuleTableRun@(user)#<None>+0x69 stack: 0x0
 0x451b0871bec0:[0x41803bb49a80]User_WorldCleanup@(user)#<None>+0xcd stack: 0x41803b8a7e40
 0x451b0871bf00:[0x41803b4ee7ca]InitTable_Cleanup@vmkernel#nover+0x27 stack: 0x43026e269070
 0x451b0871bf20:[0x41803b5407c1]World_TryReap@vmkernel#nover+0x336 stack: 0x0
 0x451b0871bf90:[0x41803b50e012]ReaperWorkerWorld@vmkernel#nover+0xc7 stack: 0x0
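To check whether a collected backtrace matches this signature, a small script can extract the function names from each frame and look for ProcFSRemoveNode immediately followed by UserProc_Remove. The following is a minimal sketch that assumes the frame layout shown above; the names FRAME_RE, frame_symbols, and matches_signature are hypothetical helpers, not part of any VMware tool.

```python
import re

# One backtrace frame looks like:
#  0x451b0871bdc0:[0x41803bc442be]ProcFSRemoveNode@(procfs)#<None>+0x53 stack: ...
# The pattern captures the function name between "]" and "@".
FRAME_RE = re.compile(r"\[0x[0-9a-fA-F]+\](\w+)@")


def frame_symbols(backtrace_text):
    """Return the function names in the order they appear in the backtrace."""
    return [m.group(1) for m in FRAME_RE.finditer(backtrace_text)]


def matches_signature(backtrace_text):
    """True if ProcFSRemoveNode is immediately followed by UserProc_Remove."""
    symbols = frame_symbols(backtrace_text)
    return any(
        first == "ProcFSRemoveNode" and second == "UserProc_Remove"
        for first, second in zip(symbols, symbols[1:])
    )


if __name__ == "__main__":
    # Two frames copied from the saved backtrace in this article.
    sample = (
        " 0x451b0871bdc0:[0x41803bc442be]ProcFSRemoveNode@(procfs)#<None>+0x53 stack: 0x43320fa23ea0\n"
        " 0x451b0871bde0:[0x41803bb4e2be]UserProc_Remove@(user)#<None>+0xd7 stack: 0x43320fa22010\n"
    )
    print(matches_signature(sample))
```

A match only indicates that the backtrace resembles the one in this article; it does not by itself confirm the root cause.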

Environment

VMware vSphere 6.7.x

Cause

This issue occurs because a CPU enters a deadlock. The ProcFSRemoveNode function enters a loop that spin-waits, and the wait does not complete, which leads to the PSOD.

Resolution

This issue is resolved in ESXi 6.7 P02 (release number ESXi670-202004002, build number 16075168).
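To see whether a host already runs a build that contains the fix, its build number can be compared with 16075168. The following is a minimal sketch that assumes a version banner of the form "VMware ESXi 6.7.0 build-16075168", such as the one the `vmware -v` command typically prints on an ESXi host; the banner format and the helper name has_fix are assumptions for illustration.

```python
import re

# Build number of ESXi 6.7 P02, as stated in this article.
FIXED_BUILD = 16075168

# Assumed banner format, for example: "VMware ESXi 6.7.0 build-16075168"
BANNER_RE = re.compile(r"ESXi (\d+)\.(\d+)\.\d+ build-(\d+)")


def has_fix(banner):
    """Return True or False for a 6.7 host, or None if the banner is not
    a 6.7 banner or cannot be parsed."""
    match = BANNER_RE.search(banner)
    if match is None:
        return None
    major, minor, build = (int(group) for group in match.groups())
    if (major, minor) != (6, 7):
        # This sketch only compares build numbers within the 6.7 line.
        return None
    return build >= FIXED_BUILD


if __name__ == "__main__":
    print(has_fix("VMware ESXi 6.7.0 build-16075168"))
```

The function returns None instead of guessing when the banner belongs to another release line, because this article covers vSphere 6.7.x only.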

Workaround:
No workaround is available.