When testing path redundancy with the Emulex OneConnect 10GbE FCoE HBA, the ESXi host fails with a purple diagnostic screen and reports the error: PCPU 8: no heartbeat

Article ID: 344084

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

When testing path redundancy with the Emulex OneConnect 10GbE FCoE HBA, you experience these symptoms:

  • The ESXi host fails with a purple diagnostic screen
  • You see a backtrace similar to:

    cpu5:4101)@BlueScreen: PCPU 8: no heartbeat (but 2/2 IPIs received).
    cpu5:4101)Code start: 0x418007800000 VMK uptime: 13:02:32:23.771
    cpu5:4101)Saved backtrace from: pcpu 8 Heartbeat NMI
    cpu5:4101)0x41224b8079f0:[0x418007f17887]lpfc_sli4_fcf_rr_next_index_get@<None>#<None>+0x2e stack: 0x34b807a3
    cpu5:4101)0x41224b807bb0:[0x418007ec4160]lpfc_cmpl_els_flogi@<None>#<None>+0x147 stack: 0x410036002910
    cpu5:4101)0x41224b807be0:[0x418007ec3aa4]lpfc_cmpl_fabric_iocb@<None>#<None>+0xa3 stack: 0x410001419690
    cpu5:4101)0x41224b807d60:[0x418007f14015]lpfc_sli_sp_handle_rspiocb@<None>#<None>+0x854 stack: 0x410036001748
    cpu5:4101)0x41224b807dc0:[0x418007f1a7d2]lpfc_sli_handle_slow_ring_event_s4@<None>#<None>+0x1bd stack: 0x1000
    cpu5:4101)0x41224b807eb0:[0x418007ed5fb9]lpfc_do_work@<None>#<None>+0xb58 stack: 0x0
    cpu5:4101)0x41224b807f50:[0x418007ceb0ff][email protected]#9.2+0x1ce stack: 0x41224b807fa0
    cpu5:4101)0x41224b807fa0:[0x418007ce8596][email protected]#9.2+0xa1 stack: 0x412200000003
    cpu5:4101)0x41224b807ff0:[0x4180078a4fa3]vmkWorldFunc@vmkernel#nover+0x52 stack: 0x0
    cpu5:4101)0x41224b807ff8:[0x0]<unknown> stack: 0x0
    cpu5:4101)base fs=0x0 gs=0x418041400000 Kgs=0x0
    cpu5:4101)Heartbeat: 618: PCPU 8 didn't have a heartbeat for 21 seconds. *may* be locked up
    cpu8:4832)NMI: 1943: NMI IPI received. Was eip(base):ebp:cs [0x7178e9(0x418007800000):0x41224b8079f0:0x4010](Src 0x1, CPU8)
    cpu13:4109)Heartbeat: 618: PCPU 8 didn't have a heartbeat for 8 seconds. *may* be locked up
    cpu11:4858)Logs are stored on non-persistent storage. Consult product documentation to configure a syslog server or a scratch partition.
    cpu5:4101)Heartbeat: 618: PCPU 8 didn't have a heartbeat for 21 seconds. *may* be locked up
    cpu5:4101)Backtrace for current CPU #5, worldID=4101, ebp=0x412240147a68
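When triaging a backtrace like the one above, it can help to list the function names in the saved frames to confirm that the stalled PCPU was executing Emulex lpfc driver code. The following is a minimal sketch, not a VMware tool: it assumes frames follow the `address:[address]symbol@module#version+offset` layout seen in this backtrace, and the names `FRAME_RE`, `parse_frames`, and `frames_with_prefix` are hypothetical helpers introduced only for illustration.

```python
import re

# Assumed frame layout, taken from the backtrace in this article:
#   cpuN:world)0xSTACK:[0xADDR]symbol@module#version+0xOFFSET stack: 0x...
FRAME_RE = re.compile(
    r"\)0x[0-9a-fA-F]+:\[0x[0-9a-fA-F]+\]"
    r"(?P<symbol>[^@\s+]+)@(?P<module>[^#\s]+)#"
)

def parse_frames(log_text):
    """Return (symbol, module) pairs for each recognizable frame, in log order."""
    return [(m.group("symbol"), m.group("module"))
            for m in FRAME_RE.finditer(log_text)]

def frames_with_prefix(frames, prefix="lpfc_"):
    """Return the symbols whose names start with the given driver prefix."""
    return [symbol for symbol, _ in frames if symbol.startswith(prefix)]

# A few lines copied from the backtrace above.
SAMPLE = """\
cpu5:4101)0x41224b8079f0:[0x418007f17887]lpfc_sli4_fcf_rr_next_index_get@<None>#<None>+0x2e stack: 0x34b807a3
cpu5:4101)0x41224b807bb0:[0x418007ec4160]lpfc_cmpl_els_flogi@<None>#<None>+0x147 stack: 0x410036002910
cpu5:4101)0x41224b807ff0:[0x4180078a4fa3]vmkWorldFunc@vmkernel#nover+0x52 stack: 0x0
cpu5:4101)0x41224b807ff8:[0x0]<unknown> stack: 0x0
"""

frames = parse_frames(SAMPLE)
lpfc_frames = frames_with_prefix(frames)
print("Frames in lpfc driver code:", lpfc_frames)
```

Lines that do not match the assumed layout, such as the `<unknown>` frame or any garbled frame, are skipped rather than guessed at. If the top frames carry the `lpfc_` prefix, as they do here, that is consistent with a fault in the Emulex driver path.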


Environment

VMware ESXi 4.1.x Installable
VMware vSphere ESXi 5.0
VMware ESX 4.1.x
VMware ESXi 4.0.x Installable
VMware ESX 4.0.x
VMware vSphere ESXi 5.5
VMware vSphere ESXi 5.1

Cause

This issue occurs due to a mismatch between the driver and firmware versions. Most drivers for these cards are async drivers, which are released separately from the ESXi image. Upgrading the ESXi host therefore does not update them, and you must update the drivers manually after the upgrade.

Resolution

To resolve this issue, upgrade the driver and firmware to the latest version. To identify the correct driver, see Identifying a PCI device in ESX/ESXi 4.x or ESXi 5.x and comparing it with the VMware HCL (1031534).

For more information on driver compatibility and the latest versions, see the VMware Compatibility Guide.

Note: The firmware version must be verified with the hardware vendor.

Additional Information

For related information, see:

Identifying correct driver for ESXi/ESX host PCI devices (HBA) using VMware Hardware Compatibility Guide (HCL)
Installing async drivers on ESXi 4.x and ESX 4.x
How to download and install async drivers in ESXi 5.x/6.x