The ESXi host fails to boot because the PCI device is not detected, and the host halts with a purple screen of death (PSOD) showing the following backtrace:
#0 MMIORead32 (addr=#####) at bora/public/vm_basic_asm_x86_common.h:###
#1 MSIXDoMaskVector (index=<optimized out>, vector=###, dev=##############) at bora/vmkernel/hardware/pci/msix.c:###
#2 MSIXMaskVector (vector=###, icData=<optimized out>) at bora/vmkernel/hardware/pci/msix.c:###
#3 ################## in IntrCookieMaskInt (intrCookieInfo=intrCookieInfo@entry=############, vector=vector@entry=###) at bora/vmkernel/main/intrCookie.c:####
#4 ################## in IntrCookieMaskInt (vector=###, intrCookieInfo=<optimized out>) at bora/vmkernel/main/intrCookie.c:####
#5 IntrCookie_AddHandlerWithModID (cookie=cookie@entry=##, moduleID=moduleID@entry=##, acknowledge=<optimized out>, h=<optimized out>, data=#############, sharable=0 '\000', sharable@entry=1 '\001', name=############# "0000:17:00.0-roce-0", flags=0) at bora/vmkernel/main/intrCookie.c:###
#6 ################## in vmk_IntrRegister (moduleID=##, intrCookie=64, props=props@entry=##################) at bora/vmkernel/hardware/vmkapi/vmkapi_intr.c:###
#7 ################## in bnxtroce_register_irq (bp=bp@entry=##################, irq_idx=irq_idx@entry=0) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_main.c:912
#8 ################## in bnxtroce_enable_fw_chnl (bp=bp@entry=##################) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_main.c:####
#9 ################## in bnxtroce_hw_init (bp=bp@entry=##################) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_main.c:####
#10 ################## in bnxtroce_attach_device (device=##################) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_drv.c:####
#11 ################## in Driver_AnnounceDevice (device=##################, modID=modID@entry=##, freezeCookie=0x0, driver=driver@entry=##################) at bora/vmkernel/core/driver.c:386
#12 ################## in DeviceAttach (modID=##, dev=##################) at bora/vmkernel/core/device_layer.c:####
#13 DeviceBind (handle=##################, modID=##, attachOnly=<optimized out>) at bora/vmkernel/core/device_layer.c:####
#14 ################## in DeviceVSIBind (instArgs=<optimized out>, inputArgs=##################, attachOnly=<optimized out>) at bora/vmkernel/core/device_layer.c:####
Note: The backtrace above is provided only as an example; the specific values will vary by environment.
VMware ESXi 7.x
The memory dump shows that the PSOD was caused by an incorrect interrupt cookie (intrCookie) shared with the bnxtnet driver. Attaching the bnxtroce driver triggered the PSOD because the installed bnxtnet and bnxtroce drivers were inconsistent with each other: their versions did not match.
This PSOD has also been observed on a specific server model, the Nutanix NX-8170-G9.
bnxtnet 226.0.121.0-1OEM.700.1.0.15843807
bnxtroce 216.0.58.0-23vmw.703.0.50.20036589
Update the bnxtnet and bnxtroce drivers to the same version, as specified in the Compatibility Guide.
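The version comparison can be illustrated with a minimal Python sketch. It assumes that the driver version is the part of the VIB version string before the first hyphen, and that "same version" means those parts are identical. The helper names `driver_version` and `versions_match` are hypothetical and not part of any VMware tool. The installed version strings themselves can be read on the host with `esxcli software vib list`.

```python
# Illustrative sketch only: compares the driver portion of two VIB version
# strings. Treating the text before the first '-' as the driver version is
# an assumption made for this example.

def driver_version(vib_version: str) -> str:
    """Return the text before the first '-' in a VIB version string."""
    return vib_version.split("-", 1)[0]


def versions_match(bnxtnet_vib: str, bnxtroce_vib: str) -> bool:
    """Return True when both VIBs carry the same driver version."""
    return driver_version(bnxtnet_vib) == driver_version(bnxtroce_vib)


# Version strings taken from the affected host described in this article.
installed = {
    "bnxtnet": "226.0.121.0-1OEM.700.1.0.15843807",
    "bnxtroce": "216.0.58.0-23vmw.703.0.50.20036589",
}

if not versions_match(installed["bnxtnet"], installed["bnxtroce"]):
    print(
        "Mismatch: bnxtnet", driver_version(installed["bnxtnet"]),
        "vs bnxtroce", driver_version(installed["bnxtroce"]),
    )
```

For the versions listed above, the check reports a mismatch (226.0.121.0 versus 216.0.58.0), which is the condition described in this article. The Compatibility Guide remains the authority on which driver pair is supported.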