ESXi hosts intermittently enter a hung state, with no apparent changes in the environment and no changes or increases in workload. You may see a message similar to the following:
YYYY-MM-DDTHH:MM:SS In(14) vobd[2097787]: [cpuCorrelator] 14103515363517us: [vob.cpu.nmi.ipi.savebt] NMI IPI: RIPOFF(base):RBP:CS [0x42e95a(0x42000c800000):0x453a5649f100:0x748] (Src 0x1, CPU34)
VMware vSphere ESXi (All Versions)
The cause is a possible deadlock condition on the CPU. A CPU deadlock typically occurs when multiple sources require access that cannot be serviced because of latency, held locks, or other CPU issues.
YYYY-MM-DDTHH:MM:SS Wa(180) vmkwarning: cpu34:2098260)WARNING: Lock: 1660: (held by 35: Spin count exceeded 1 time(s) - possible deadlock.
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu59:2098600)DOM: DOMServerCheckAndTraceCpuLockup:3396: CPU lockup: op 0x45da8d3686c0 writeEC on obj d92d1060-####-####-####-############ execution time 26341 ms exceeds threshold 500 ms
After the above timeout, the VMkernel logs a backtrace:
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09b840:[0x42000c823e47]Lock_CheckSpinCount@vmkernel#nover+0x157 stack: 0x500000005
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09b890:[0x42000c9242bf]SP_WaitLockIRQ@vmkernel#nover+0xf0 stack: 0x420045c06880
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09b8e0:[0x42000c9243cd]SPLockIRQWork@vmkernel#nover+0x5e stack: 0x22
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09b900:[0x42000cc21f13]CpuSched_IdleHaltEnd@vmkernel#nover+0x40 stack: 0x453a1d09ba30
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09b940:[0x42000c8f8a00]IntrCookie_DoInterrupt@vmkernel#nover+0x4f5 stack: 0x0
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu71:2099673)osfs: OSFS_GetMountPointList:3748: mountPoints[0] inUse pid [ vsan], cid 521b604e51fade18-d15de08e81b9921a
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09b9f0:[0x42000c8f8baf]IntrCookie_VmkernelInterrupt@vmkernel#nover+0x38 stack: 0xef
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09ba10:[0x42000c96e7f2]IDT_IntrHandler@vmkernel#nover+0x97 stack: 0x0
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09ba30:[0x42000c9670b6]gate_entry@vmkernel#nover+0xa7 stack: 0x0
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09baf8:[0x42000c893798]Power_ArchPerformWait@vmkernel#nover+0xd4 stack: 0x420048801880
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09bb00:[0x42000c8938e9]Power_ArchSetCState@vmkernel#nover+0xba stack: 0x0
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09bb50:[0x42000cc263b1]CpuSchedIdleLoopInt@vmkernel#nover+0x292 stack: 0x0
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09bbc0:[0x42000cc2aa1c]CpuSchedDispatch@vmkernel#nover+0x1f21 stack: 0x452100000001
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09be00:[0x42000cc2b441]CpuSchedWait@vmkernel#nover+0x362 stack: 0x6f
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09bf70:[0x42000d749c4a]NetSchedHClkSchedSysWorld@(netsched_hclk)#<None>+0x1d7 stack: 0x453a1761f000
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09bfe0:[0x42000cc2c149]CpuSched_StartWorld@vmkernel#nover+0xe2 stack: 0x0
YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu34:2098260)0x453a1d09c000:[0x42000c8dbe7f]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
Engage the server hardware vendor to review logs and run diagnostics on the CPU and other relevant hardware. You can also open a service request with VMware by Broadcom to have the ESXi host logs reviewed.