ESXi Host Fails to Boot with PSOD Due to PCI Device Not Found with Backtrace [MSIXDoMaskVector, MSIXMaskVector, IntrCookieMaskInt, IntrCookieMaskInt, IntrCookie_AddHandlerWithModID]

Article ID: 391586

Products

VMware vSphere ESXi

Issue/Introduction

The ESXi host fails to boot because a PCI device is not detected, resulting in a purple screen of death (PSOD) with the following backtrace.

#0  MMIORead32 (addr=#####) at bora/public/vm_basic_asm_x86_common.h:###
#1  MSIXDoMaskVector (index=<optimized out>, vector=###, dev=##############) at bora/vmkernel/hardware/pci/msix.c:###
#2  MSIXMaskVector (vector=###, icData=<optimized out>) at bora/vmkernel/hardware/pci/msix.c:###
#3  ################## in IntrCookieMaskInt (intrCookieInfo=intrCookieInfo@entry=############, vector=vector@entry=###) at bora/vmkernel/main/intrCookie.c:####
#4  ################## in IntrCookieMaskInt (vector=###, intrCookieInfo=<optimized out>) at bora/vmkernel/main/intrCookie.c:####
#5  IntrCookie_AddHandlerWithModID (cookie=cookie@entry=##, moduleID=moduleID@entry=##, acknowledge=<optimized out>, h=<optimized out>, data=#############, sharable=0 '\000', sharable@entry=1 '\001', name=############# "0000:17:00.0-roce-0", flags=0) at bora/vmkernel/main/intrCookie.c:###
#6  ################## in vmk_IntrRegister (moduleID=##, intrCookie=64, props=props@entry=##################) at bora/vmkernel/hardware/vmkapi/vmkapi_intr.c:###
#7  ################## in bnxtroce_register_irq (bp=bp@entry=##################, irq_idx=irq_idx@entry=0) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_main.c:912
#8  ################## in bnxtroce_enable_fw_chnl (bp=bp@entry=##################) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_main.c:####
#9  ################## in bnxtroce_hw_init (bp=bp@entry=##################) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_main.c:####
#10 ################## in bnxtroce_attach_device (device=##################) at vmkdrivers/native/BSD/Network/bnxt/bnxtroce/bnxtroce_drv.c:####
#11 ################## in Driver_AnnounceDevice (device=##################, modID=modID@entry=##, freezeCookie=0x0, driver=driver@entry=##################) at bora/vmkernel/core/driver.c:386
#12 ################## in DeviceAttach (modID=##, dev=##################) at bora/vmkernel/core/device_layer.c:####
#13 DeviceBind (handle=##################, modID=##, attachOnly=<optimized out>) at bora/vmkernel/core/device_layer.c:####
#14 ################## in DeviceVSIBind (instArgs=<optimized out>, inputArgs=##################, attachOnly=<optimized out>) at bora/vmkernel/core/device_layer.c:####

 

Note that the backtrace above is only an example; addresses, line numbers, and device names vary by environment.
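
Because the masked values differ from host to host, a PSOD can be compared with this article by function names only. The following is a minimal sketch, not a supported tool: it assumes the backtrace is available as gdb-style text like the example above, and the names `SIGNATURE`, `frame_functions`, and `matches_signature` are hypothetical helpers introduced here for illustration.

```python
import re

# Frames that characterise this PSOD, innermost first, taken from the
# example backtrace in this article. Which frames to require is an
# assumption made for this sketch.
SIGNATURE = [
    "MSIXDoMaskVector",
    "MSIXMaskVector",
    "IntrCookieMaskInt",
    "IntrCookie_AddHandlerWithModID",
    "vmk_IntrRegister",
    "bnxtroce_register_irq",
]

# Matches frame lines with or without an address, for example
#   "#1  MSIXDoMaskVector (index=...) at ..."
#   "#3  0x... in IntrCookieMaskInt (...) at ..."
FRAME_RE = re.compile(r"^#\d+\s+(?:\S+\s+in\s+)?([A-Za-z_]\w*)\s*\(")


def frame_functions(backtrace: str) -> list[str]:
    """Return the function name of every frame line, in order."""
    funcs = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            funcs.append(m.group(1))
    return funcs


def matches_signature(backtrace: str) -> bool:
    """True if SIGNATURE appears as an ordered subsequence of the frames."""
    remaining = iter(frame_functions(backtrace))
    # Each any() consumes the iterator up to the next wanted frame,
    # so the frames must occur in the same order as in SIGNATURE.
    return all(any(f == want for f in remaining) for want in SIGNATURE)
```

Matching on an ordered subsequence rather than exact equality tolerates repeated or additional frames, such as the two `IntrCookieMaskInt` entries in the example.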

Environment

VMware ESXi 7.x

Cause

The memory dump shows that the PSOD was caused by an incorrect intrCookie (interrupt cookie) shared with the bnxtnet driver. Attaching the bnxtroce driver triggered the PSOD because the installed bnxtnet and bnxtroce drivers were inconsistent, with mismatched versions.
This PSOD has also been observed on a specific server model, the Nutanix NX-8170-G9.

Mismatched driver versions observed in this case:

bnxtnet  226.0.121.0-1OEM.700.1.0.15843807
bnxtroce 216.0.58.0-23vmw.703.0.50.20036589
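
A mismatch like the one above can be spotted from a VIB listing. The sketch below assumes input lines of the form "name version ...", as in the two lines above or the Name and Version columns printed by `esxcli software vib list`. Treating the part of the version before the first hyphen as the driver release, and requiring it to be identical for both drivers, is an assumption made for illustration; the Compatibility Guide remains the authority on which versions belong together. All function names are hypothetical.

```python
def parse_vib_versions(listing: str) -> dict[str, str]:
    """Map VIB name -> version from whitespace-separated 'name version ...' lines."""
    versions = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


def driver_release(version: str) -> str:
    """Strip the build suffix: '226.0.121.0-1OEM.700...' -> '226.0.121.0'."""
    return version.split("-", 1)[0]


def bnxt_versions_match(listing: str) -> bool:
    """Compare the bnxtnet and bnxtroce releases; raises KeyError if either is missing."""
    versions = parse_vib_versions(listing)
    return driver_release(versions["bnxtnet"]) == driver_release(versions["bnxtroce"])
```

For the two versions listed above, the releases are `226.0.121.0` and `216.0.58.0`, so the check reports a mismatch.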

Resolution

Update the bnxtnet and bnxtroce drivers to the same version, as specified in the Compatibility Guide.