ESXi not responding with error "PCPU XX didn't have a heartbeat for 7 seconds. NMI IPI: RIPOFF(base)"

Article ID: 427207

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • The ESXi host appears as "non-responsive" in vCenter Server and cannot be reconnected, leaving its running VMs inaccessible.
  • /var/run/log/vmkernel.log contains a backtrace similar to this example:
    WARNING: Heartbeat: 822: PCPU XX didn't have a heartbeat for 7 seconds; *may* be locked up.
    ALERT: NMI: 690: NMI IPI: RIPOFF(base):RBP:CS [0x35f010(0x42001b000000):0x1:0xf48] (Src 0x1, CPU35)
    
    PCISbdfMap_Find@vmkernel#nover+0x48 stack: 0x42001b071480
    PCIPassthruGetVmmOwnerWorld@(pciPassthru)#<None>+0x17 stack: 0xf48
    VMKPCIPassthru_DevInUse@vmkernel#nover+0x13 stack: 0x453959f1b6a5
    PCIVGA_KernelConsoleEnabled@vmkernel#nover+0x15 stack: 0x41ffdb151d68
    SVGAConsoleClearRender@vmkernel#nover+0x36 stack: 0x7e
    SVGAConsoleClear@vmkernel#nover+0x177 stack: 0x43209600b2ff
    TermPutc@vmkernel#nover+0x120b stack: 0x25798100e308b46
    Term_Putb@vmkernel#nover+0x52 stack: 0x453959f1b89c
    TTYWriteToGeneric@vmkernel#nover+0x13a stack: 0x1
    UserTeletypeWriteBuffer@vmkernel#nover+0x6c stack: 0x430109c01438
    UserTeletypeWriteInt@vmkernel#nover+0x1eb stack: 0x0
    TTYWriteMethod@vmkernel#nover+0x36 stack: 0x430ac5eea550
    CharDriverAsyncIO@vmkernel#nover+0xf5 stack: 0x430c9aa04020
    FDS_AsyncIO@vmkernel#nover+0x6a3 stack: 0x453959f1ba70
    FDS_DoSyncIO@vmkernel#nover+0xf4 stack: 0x9c40
    DevFSFileIO@vmkernel#nover+0x38f stack: 0x4308ec0045c0
    FSSVec_FileIO@vmkernel#nover+0x20 stack: 0x1
    UserChardevIO@vmkernel#nover+0xfe stack: 0x954
    UserChardevWrite@vmkernel#nover+0x1f stack: 0x43209600bc44
    UserVmfs_Writev@vmkernel#nover+0x119 stack: 0x1874
    LinuxFileDesc_Write@vmkernel#nover+0xe5 stack: 0x45d9821fc650
    User_LinuxSyscallHandler@vmkernel#nover+0x1a4 stack: 0x0
    gate_entry@vmkernel#nover+0x68 stack: 0x0
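
To check whether the same PCPU is reported repeatedly, the heartbeat warnings can be pulled out of the log with a short script. The following is a minimal sketch, not a supported tool. It assumes that the "XX" placeholder in the example above is a decimal PCPU number in real logs and that the message wording matches the example. The function names are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Pattern derived from the example warning above; assumes the "XX"
# placeholder is a decimal PCPU number in a real vmkernel.log.
HEARTBEAT_RE = re.compile(
    r"PCPU (\d+) didn't have a heartbeat for (\d+) seconds"
)


def find_heartbeat_misses(log_text):
    """Return a list of (pcpu, seconds) tuples, one per warning found."""
    return [
        (int(m.group(1)), int(m.group(2)))
        for m in HEARTBEAT_RE.finditer(log_text)
    ]


def tally_by_pcpu(misses):
    """Count how many warnings were logged for each PCPU."""
    return Counter(pcpu for pcpu, _seconds in misses)


def summarize_file(path):
    """Print a per-PCPU warning count for the log file at `path`."""
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, errors="replace") as log_file:
        misses = find_heartbeat_misses(log_file.read())
    for pcpu, count in sorted(tally_by_pcpu(misses).items()):
        print(f"PCPU {pcpu}: {count} missed-heartbeat warning(s)")
```

For example, calling `summarize_file("/var/run/log/vmkernel.log")` on the host, or on a copy of the log extracted from a support bundle, prints one line per affected PCPU. Warnings concentrated on a single PCPU are useful detail to pass to the hardware vendor.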

Environment

VMware vSphere ESXi

Cause

The PCPU became unresponsive (it stopped updating its heartbeat), and a Non-Maskable Interrupt (NMI) was logged for it. This pattern commonly indicates an underlying hardware issue.

Resolution

Contact the hardware vendor to run hardware diagnostics on the host, paying particular attention to the PCPU reported in the log.