ESXi PSOD caused by IOMMU Fault in bnxtnet driver

Article ID: 436908

Products

VMware vSphere ESXi

Issue/Introduction

  • VMware ESXi experiences a Purple Screen of Death (PSOD) triggered by an IOMMU Fault involving the bnxtnet driver.
  • The /var/run/log/vmkernel.log file shows that the hardware device (Broadcom BCM57414 NetXtreme-E) attempted a Direct Memory Access (DMA) operation that was blocked by the system's Input-Output Memory Management Unit (IOMMU).
  • The same log records the following warnings leading up to the kernel panic:

YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: VTD: ###:  IOMMU Fault IOMMU Unit #2: R/W=W, Device ####:##:##.# Addr = 0x########
YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: VTD: ###:  Reason = 0x# -> DMA outside of guest address width.
YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: VTD: ###:  DMAR Fault IOMMU Unit #2: R/W=W Device ####:##:##.# Addr = 0x########
YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: VTD: ###: Reason = 0x# -> DMA outside of guest address width.
YYYY-MM-DDThh:mm:ss.###Z In(###) vmkernel: cpu#:######)IOMMU: ####: ####:##:##.#: IOMMU Fault detected. IOaddr: 0x######## Mask: 0x#
YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: IOMMU: ####: ####:##:##.#: IOMMU Fault detected for (vmnic#/bnxtnet) IOaddr: 0x######## Mask: 0x# Domain: 0x########.
...
YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: World: vm ######: ####: vmm#:<vm_name>:vmk: vcpu-#:Invalid pframe type for non-zero pinCount
YYYY-MM-DDThh:mm:ss.###Z Wa(###) vmkwarning: cpu#:######)WARNING: World: vm ######: ####: Simultaneous panic! :vmk: vcpu-#:Invalid pframe type for non-zero pinCount
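
To check whether a host has logged this signature, the log can be searched for the "IOMMU Fault detected for (vmnic#/bnxtnet)" line shown above. The following is a minimal Python sketch, not an official tool. The regular expression is modeled only on the sample line in this article, and the names FAULT_RE, IommuFault, find_iommu_faults, and scan_log are hypothetical.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Pattern modeled on the sample line in this article:
#   "... IOMMU Fault detected for (vmnic#/bnxtnet) IOaddr: 0x######## ..."
# Assumption: real log lines keep this wording and field order.
FAULT_RE = re.compile(
    r"IOMMU Fault detected for \((?P<nic>vmnic\d+)/(?P<driver>[\w-]+)\)"
    r"\s+IOaddr:\s+(?P<ioaddr>0x[0-9a-fA-F]+)"
)


class IommuFault(NamedTuple):
    nic: str      # uplink name, for example "vmnic4"
    driver: str   # driver name reported in the log line
    ioaddr: int   # faulting I/O address parsed from the hex string


def find_iommu_faults(lines: Iterable[str],
                      driver: str = "bnxtnet") -> Iterator[IommuFault]:
    """Yield one IommuFault per matching log line for the given driver."""
    for line in lines:
        match = FAULT_RE.search(line)
        if match and match.group("driver") == driver:
            yield IommuFault(
                nic=match.group("nic"),
                driver=match.group("driver"),
                ioaddr=int(match.group("ioaddr"), 16),
            )


def scan_log(path: str = "/var/run/log/vmkernel.log") -> List[IommuFault]:
    """Scan a vmkernel log file and return all matching faults."""
    with open(path, errors="replace") as handle:
        return list(find_iommu_faults(handle))
```

Calling scan_log() on the host, or on a copy of vmkernel.log extracted from a support bundle, returns an empty list when no matching line is present. A non-empty result names the vmnic involved, which helps identify the adapter to update.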

Environment

  • VMware ESXi 8.x
  • Broadcom BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (bnxtnet driver)

Cause

  • The kernel panic is caused by a Direct Memory Access (DMA) violation involving the NIC managed by the bnxtnet driver.
  • The NIC hardware attempts to access a memory address that its IOMMU protection domain does not permit. The Input-Output Memory Management Unit (IOMMU) blocks the request, which triggers a Non-Maskable Interrupt (NMI).
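
The reason text in the log, "DMA outside of guest address width", suggests that the faulting address lies beyond the range the IOMMU domain can translate. The following Python sketch illustrates that idea only. The function name exceeds_address_width is hypothetical, and the address width must be supplied by the caller because this article does not state the value ESXi uses.

```python
def exceeds_address_width(io_addr: int, width_bits: int) -> bool:
    """Return True if io_addr cannot be represented in width_bits bits.

    Illustration only: a DMA target at or above 2**width_bits falls
    outside the address range described by that width.
    """
    if io_addr < 0 or width_bits <= 0:
        raise ValueError("io_addr must be >= 0 and width_bits must be > 0")
    return io_addr >= (1 << width_bits)
```

For example, with a hypothetical 48-bit width, any address of 2**48 or higher would be reported as out of range, while 2**48 - 1 would not.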

Resolution

  1. Place the affected ESXi host in Maintenance Mode.
  2. Install the latest bnxtnet driver version.
  3. Apply the latest firmware update for the Broadcom BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller following the hardware vendor's specific utility instructions.
  4. Reboot the ESXi host if required.
  5. Take the ESXi host out of Maintenance Mode.
  6. If the PSOD recurs after the update, contact the server hardware vendor for further assistance.
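
The host-side steps above can be sketched as a short command plan. This Python example is an assumption-laden sketch, not a supported procedure. It builds esxcli command lines for recording the current driver and firmware versions, entering Maintenance Mode, and leaving it again. The driver and firmware installation itself is deliberately left out and should follow the linked articles and the hardware vendor's utility. The function names and the vmnic4 uplink name are hypothetical.

```python
import subprocess
from typing import List

Command = List[str]


def pre_update_commands(vmnic: str) -> List[Command]:
    """Commands to run before the driver and firmware update."""
    return [
        # Record the driver and firmware versions currently in use.
        ["esxcli", "network", "nic", "get", "-n", vmnic],
        # List installed VIBs so the bnxtnet package version can be noted.
        ["esxcli", "software", "vib", "list"],
        # Step 1: enter Maintenance Mode (VMs must already be evacuated
        # or powered off for this to complete).
        ["esxcli", "system", "maintenanceMode", "set", "--enable", "true"],
    ]


def post_update_commands(vmnic: str) -> List[Command]:
    """Commands to run after the update and any required reboot."""
    return [
        # Confirm the new driver and firmware versions are reported.
        ["esxcli", "network", "nic", "get", "-n", vmnic],
        # Step 5: leave Maintenance Mode.
        ["esxcli", "system", "maintenanceMode", "set", "--enable", "false"],
    ]


def run(commands: List[Command], dry_run: bool = True) -> None:
    """Print each command and execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default: prints the plan without changing the host.
    run(pre_update_commands("vmnic4"))
    run(post_update_commands("vmnic4"))
```

Keeping dry_run as the default means the script only prints the plan until it is explicitly told to execute, which is a safer starting point on a production host.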

Additional Information

Broadcom BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller compatibility guide

Update Broadcom bnxtnet Driver/Firmware

Download and install async drivers in VMware ESXi