The following symptoms may be observed:
Messages in /var/run/log/vmkernel*, including but not limited to:
nmlx5_CoreAccessReg: command failed: Timeout
Failed accessing MTMP register
Device's health is compromised: PCI COMM error
Device internal error state is set
IO was aborted
VMware vSphere ESXi
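The symptom messages above can be searched for in the logs programmatically. The following is a minimal sketch, not a supported tool: it assumes a Python 3 interpreter is available where the logs are read, and that rotated logs matching the path may be gzip-compressed. The function names `find_symptoms` and `scan_logs` are hypothetical; the message fragments are copied from the list above.

```python
import glob
import gzip

# Message fragments copied from the symptom list in this article.
SIGNATURES = [
    "nmlx5_CoreAccessReg: command failed: Timeout",
    "Failed accessing MTMP register",
    "Device's health is compromised: PCI COMM error",
    "Device internal error state is set",
    "IO was aborted",
]


def find_symptoms(lines):
    """Return (signature, line) pairs for each line containing a known fragment."""
    hits = []
    for line in lines:
        for sig in SIGNATURES:
            if sig in line:
                hits.append((sig, line.rstrip("\n")))
    return hits


def scan_logs(pattern="/var/run/log/vmkernel*"):
    """Scan every file matching the pattern (hypothetical helper).

    Assumption: files ending in .gz are gzip-compressed rotated logs;
    everything else is read as plain text.
    """
    hits = []
    for path in sorted(glob.glob(pattern)):
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as fh:
            hits.extend(find_symptoms(fh))
    return hits
```

Calling `scan_logs()` returns a list of matches that can be printed or counted per signature. Because the symptom list is described as "not limited to" these messages, an empty result does not rule out the issue.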
This issue is caused by a hardware-level failure affecting communication between the ESXi host and the network adapter over the PCIe bus.
A PCI COMM error indicates that the adapter has entered an internal error state and is no longer able to respond to PCIe transactions.
Once this condition occurs, the nmlx5 driver cannot successfully access device registers or recover normal adapter operation.
Common causes include:
esxcli network nic get -n vmnicX
To validate the driver/firmware version of the network adapters, refer to the article: Determining Network/Storage firmware and driver version in ESXi
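The output of the esxcli command above can also be collected in a script when several adapters need to be checked. The sketch below is illustrative only. It assumes the command prints indented "Key: Value" lines, that fields such as Driver, Version and Firmware Version appear in that form, and that Python 3.7 or later is available. The helper names `parse_nic_info` and `get_nic_info` are hypothetical.

```python
import subprocess


def parse_nic_info(text):
    """Collect 'Key: Value' pairs from indented command output.

    Lines with no value after the first colon (section headings) are skipped.
    """
    info = {}
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def get_nic_info(nic):
    """Run the esxcli command from this article for one adapter, e.g. 'vmnic0'."""
    result = subprocess.run(
        ["esxcli", "network", "nic", "get", "-n", nic],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if esxcli exits non-zero
    )
    return parse_nic_info(result.stdout)
```

For example, `get_nic_info("vmnic0").get("Firmware Version")` would return the firmware string if a field with that name is present, or `None` otherwise. Only the first colon on a line is treated as the separator, so values that themselves contain colons, such as PCI addresses, are kept whole.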