ESXi host fails with PSOD "Fatal error" with Mellanox ConnectX-5 network card

Article ID: 389317

Updated On: 04-21-2025

Products

VMware vSphere ESXi
VMware vSphere ESXi 7.0

Issue/Introduction

Symptoms:

  • A Mellanox ConnectX-5 network card using the "nmlx5_core" driver is installed in the ESXi host.
  • The ESXi host fails with a PSOD reporting "Fatal error".
  • The IPMI log and the PSOD backtrace contain entries similar to the following:

--IPMI log--
# esxcli hardware ipmi sel list -p -i -n all
   Record Id: XXX
   When: YYYY-MM-DDTHH:MM:SS
   Event Type: 111 (Unknown)
   SEL Type: 2 (System Event)
   Message: Assert + Critical Interrupt Bus Fatal Error
   Sensor Number: 0

--PSOD backtrace--

 ApeiHEST: 373: Fatal error from ####:##:##.#(PCI Express Endpoint), VID:15b3, DID:1017, DevSts: 0xd, AERUeSts: 0x#####.
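
The fields of the ApeiHEST line can be pulled apart offline with a small script. The following is a minimal sketch, not a VMware tool: the names `parse_hest_line` and `decode_devsts` are hypothetical, the regular expression assumes the line format shown above, and the sample address and AERUeSts value are made up because the original log masks them. The bit names follow the low four bits of the PCI Express Device Status register.

```python
import re

# Low four bits of the PCI Express Device Status register.
DEVSTS_BITS = {
    0: "Correctable Error Detected",
    1: "Non-Fatal Error Detected",
    2: "Fatal Error Detected",
    3: "Unsupported Request Detected",
}

# Assumed line format, modelled on the PSOD backtrace line in this article.
HEST_RE = re.compile(
    r"Fatal error from "
    r"(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"
    r".*?VID:(?P<vid>[0-9a-fA-F]{4}), DID:(?P<did>[0-9a-fA-F]{4}), "
    r"DevSts: (?P<devsts>0x[0-9a-fA-F]+), AERUeSts: (?P<aer>0x[0-9a-fA-F]+)"
)


def decode_devsts(value):
    """Return the names of the error flags set in a Device Status value."""
    return [name for bit, name in sorted(DEVSTS_BITS.items())
            if value & (1 << bit)]


def parse_hest_line(line):
    """Extract address, IDs and status words; return None if no match."""
    match = HEST_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["devsts_flags"] = decode_devsts(int(fields["devsts"], 16))
    return fields


if __name__ == "__main__":
    # Illustrative line: the address and AERUeSts value are placeholders.
    sample = ("ApeiHEST: 373: Fatal error from 0000:3b:00.0"
              "(PCI Express Endpoint), VID:15b3, DID:1017, "
              "DevSts: 0xd, AERUeSts: 0x100000.")
    print(parse_hest_line(sample))
```

For the DevSts value 0xd reported above, the decoder lists the correctable error, fatal error, and unsupported request flags as set. The AERUeSts value is captured but deliberately not decoded here.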

The PCI address "####:##:##.#" in the backtrace corresponds to vmnic2 (the Mellanox ConnectX-5 network card), as the lspci output shows:

# lspci
####:##:##.# Network controller Ethernet controller: Mellanox Technologies ConnectX-5 EN network interface card 10/25GbE dual-port SFP28(MCX512A-ACAT) [vmnic2]
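
Matching the PCI address from the backtrace against saved lspci output can also be scripted. This is a rough sketch under stated assumptions: `find_vmnic` is a hypothetical helper, each lspci line is assumed to start with the PCI address and end with the alias in square brackets as in the output above, and the address in the sample is a placeholder.

```python
import re


def find_vmnic(lspci_output, pci_address):
    """Return the vmnic alias for pci_address, or None if it is not found."""
    wanted = pci_address.lower() + " "
    for line in lspci_output.splitlines():
        if line.strip().lower().startswith(wanted):
            # The alias is assumed to be the trailing "[vmnicN]" token.
            match = re.search(r"\[(vmnic\d+)\]\s*$", line)
            return match.group(1) if match else None
    return None


if __name__ == "__main__":
    # Illustrative lspci line with a placeholder address.
    sample = ("0000:3b:00.0 Network controller Ethernet controller: "
              "Mellanox Technologies ConnectX-5 EN network interface card "
              "10/25GbE dual-port SFP28(MCX512A-ACAT) [vmnic2]")
    print(find_vmnic(sample, "0000:3b:00.0"))
```

Knowing which vmnic sits at the failing address helps when reporting the affected card and slot to the hardware vendor.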

Environment

VMware ESXi 7.0

Cause

The PSOD is triggered by a fatal PCI bus error, which is typically caused by a misbehaving I/O device or an electrical glitch.
This is a hardware issue.

Resolution

Contact the hardware vendor for further diagnosis.