Warning: Looks like FW is crashed/non-responsive

Article ID: 389181

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • NICs on random servers become unresponsive because the NIC firmware crashes
  • An ESXi host is shown in the vCenter Server inventory as "Not Responding"
  • An ESXi host is not reachable over the network
  • The affected NICs use the bnxtnet driver and firmware
  • LRO rx aborts are recorded on the vmnics
  • One or more physical network adapters (vmnics) may be missing or disappear from the ESXi host configuration following an upgrade or patch operation
  • VOBD messages in /var/run/log/vobd.log confirm that working uplinks are unavailable:
    [esx.problem.net.dvport.redundancy.lost] Lost uplink redundancy on DVPorts ### Physical NIC vmnic0 is down.
    [esx.problem.net.dvport.connectivity.lost] Lost network connectivity on DVPorts ### Physical NIC vmnic1 is down.
  • The VMkernel log (/var/run/log/vmkernel.log) shows entries similar to:

    Note: These may vary based on the network card interface model/driver in use.
    WARNING: bnxtnet: hwrm_send_msg:389: [vmnic0 : 0x452106aee000] HWRM cmd resp_len timeout, cmd_type 0x106(HWRM_CFA_FLOW_STATS) seq 63810
    WARNING: bnxtnet: hwrm_send_msg:389: [vmnic1 : 0x45210a38a000] HWRM cmd resp_len timeout, cmd_type 0x106(HWRM_CFA_FLOW_STATS) seq 15810
    WARNING: bnxtnet: hwrm_send_msg:389: [vmnic0 : 0x452106aee000] HWRM cmd resp_len timeout, cmd_type 0x0(HWRM_VER_GET) seq 63811
    WARNING: bnxtnet: hwrm_get_version:2501: [vmnic0 : 0x452106aee000] VER_GET failed- FW_STATUS_REG: 0x89021
    WARNING: bnxtnet: hwrm_snd_fw_msg:538: [vmnic0 : 0x452106aee000] Looks like FW is crashed/non-responsive.
    WARNING: bnxtnet: hwrm_snd_fw_msg:540: [vmnic0 : 0x452106aee000] Dumping FW trace and reporting link down to OS
    bnxtnet: bnxtnet_report_link_down_to_uplink:1527: [vmnic0 : 0x452106aee000] Reporting Link down
    WARNING: bnxtnet: hwrm_fill_fw_msg:935: [vmnic0 : 0x452106aee000] Sending HWRM message failed
    WARNING: bnxtnet: cmd_cmpl_wait:1164: [vmnic0 : 0x452106aee000] FW went bad, stop waiting for queue flush
    WARNING: bnxtnet: hwrm_send_msg:389: [vmnic1 : 0x45210a38a000] HWRM cmd resp_len timeout, cmd_type 0x0(HWRM_VER_GET) seq 15811
    WARNING: bnxtnet: hwrm_get_version:2501: [vmnic1 : 0x45210a38a000] VER_GET failed- FW_STATUS_REG: 0x89021
    WARNING: bnxtnet: hwrm_snd_fw_msg:538: [vmnic1 : 0x45210a38a000] Looks like FW is crashed/non-responsive.
    WARNING: bnxtnet: hwrm_snd_fw_msg:540: [vmnic1 : 0x45210a38a000] Dumping FW trace and reporting link down to OS
    bnxtnet: bnxtnet_report_link_down_to_uplink:1527: [vmnic1 : 0x45210a38a000] Reporting Link down
    WARNING: bnxtnet: hwrm_fill_fw_msg:935: [vmnic1 : 0x45210a38a000] Sending HWRM message failed
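The log signatures above can be checked for programmatically. The following is a minimal Python sketch, not a supported tool: the function name summarize_bnxtnet and the regular expressions are assumptions derived only from the line shapes in the excerpt above, and other driver versions or NIC models may log differently.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpt above:
#   bnxtnet: <function>:<line>: [<vmnic> : <address>] <message>
BNXT_LINE = re.compile(
    r"bnxtnet: (?P<func>\w+):(?P<line>\d+): "
    r"\[(?P<nic>vmnic\d+) : 0x[0-9a-fA-F]+\] (?P<msg>.*)"
)
CRASH_MARKER = "Looks like FW is crashed/non-responsive"
STATUS_REG = re.compile(r"FW_STATUS_REG: (0x[0-9a-fA-F]+)")


def summarize_bnxtnet(lines):
    """Return {vmnic: {"crashed": bool, "timeouts": int, "fw_status": str or None}}."""
    summary = defaultdict(
        lambda: {"crashed": False, "timeouts": 0, "fw_status": None}
    )
    for raw in lines:
        match = BNXT_LINE.search(raw)
        if not match:
            continue  # not a bnxtnet line in the assumed format
        entry = summary[match.group("nic")]
        msg = match.group("msg")
        if CRASH_MARKER in msg:
            entry["crashed"] = True
        if "resp_len timeout" in msg:
            entry["timeouts"] += 1
        reg = STATUS_REG.search(msg)
        if reg:
            entry["fw_status"] = reg.group(1)
    return dict(summary)


# Lines copied from the excerpt above.
SAMPLE = [
    "WARNING: bnxtnet: hwrm_send_msg:389: [vmnic0 : 0x452106aee000] HWRM cmd resp_len timeout, cmd_type 0x106(HWRM_CFA_FLOW_STATS) seq 63810",
    "WARNING: bnxtnet: hwrm_send_msg:389: [vmnic0 : 0x452106aee000] HWRM cmd resp_len timeout, cmd_type 0x0(HWRM_VER_GET) seq 63811",
    "WARNING: bnxtnet: hwrm_get_version:2501: [vmnic0 : 0x452106aee000] VER_GET failed- FW_STATUS_REG: 0x89021",
    "WARNING: bnxtnet: hwrm_snd_fw_msg:538: [vmnic0 : 0x452106aee000] Looks like FW is crashed/non-responsive.",
    "WARNING: bnxtnet: hwrm_send_msg:389: [vmnic1 : 0x45210a38a000] HWRM cmd resp_len timeout, cmd_type 0x0(HWRM_VER_GET) seq 15811",
    "WARNING: bnxtnet: hwrm_snd_fw_msg:538: [vmnic1 : 0x45210a38a000] Looks like FW is crashed/non-responsive.",
]

if __name__ == "__main__":
    for nic, state in sorted(summarize_bnxtnet(SAMPLE).items()):
        print(nic, state)
```

To run it against a real log, pass the lines of a copy of /var/run/log/vmkernel.log (for example, from a support bundle) instead of SAMPLE.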

 

Environment

VMware vSphere 7.x
VMware vSphere 8.x
VMware vSphere 9.x

Cause

A firmware crash on the network interface card (NIC) causes the physical adapter to fail, and the driver reports the link as down. When this happens, the ESXi host may experience a temporary halt in network traffic. Depending on the uplink configuration, the management VMkernel interface can be left without any operational uplinks, which may result in an outage and host isolation.
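Why the impact depends on the setup can be illustrated with a small Python sketch. This is a simplified model, not the logic ESXi uses: the function name classify_uplinks and the returned labels are assumptions chosen to loosely mirror the "redundancy lost" and "connectivity lost" VOBD messages shown earlier.

```python
def classify_uplinks(configured, down):
    """Classify the uplink health of one virtual switch or port group.

    configured: vmnic names assigned as uplinks (hypothetical input).
    down: vmnic names currently reported as link down.
    """
    working = [nic for nic in configured if nic not in down]
    if not working:
        # No uplink left: traffic on this switch, including a management
        # VMkernel interface attached to it, has no path off the host.
        return "connectivity lost"
    if len(working) == 1 and len(configured) > 1:
        # One uplink still carries traffic, but a second failure isolates it.
        return "redundancy lost"
    if len(working) < len(configured):
        return "redundancy degraded"
    return "ok"


if __name__ == "__main__":
    uplinks = ["vmnic0", "vmnic1"]
    print(classify_uplinks(uplinks, down={"vmnic0"}))
    print(classify_uplinks(uplinks, down={"vmnic0", "vmnic1"}))
```

In this model, a host whose management uplinks all sit on ports of the same crashed adapter goes straight to the isolated state, while a host with uplinks spread across independent adapters only loses redundancy.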

 

Resolution

  1. Reboot the affected ESXi host to clear the firmware crash and restore the server to a normal operational state.
  2. Engage the hardware vendor for hardware-level diagnostics and root cause analysis of the network interface card failure.
  3. Validate the NIC firmware and driver combination against the VMware Compatibility Guide (VCG), and update to the latest supported releases to ensure environmental stability. The guide lists which firmware and driver versions are supported and available for the network interface card.
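For step 3, the driver and firmware versions currently in use are needed before they can be compared with the compatibility guide. The Python sketch below extracts them from the text printed by esxcli network nic get -n vmnic0. The parser name parse_driver_info is hypothetical, and the sample text is an illustrative approximation of that command's "Driver Info" section with placeholder values, not captured output; the layout may differ between ESXi releases.

```python
def parse_driver_info(text):
    """Return the key/value pairs nested under the 'Driver Info:' heading."""
    info = {}
    block_indent = None  # indentation of the 'Driver Info:' line once found
    for line in text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if block_indent is None:
            if stripped == "Driver Info:":
                block_indent = indent
            continue
        # The nested block ends at a blank line or at the first line that
        # is not indented deeper than the heading.
        if not stripped or indent <= block_indent:
            break
        key, _, value = stripped.partition(":")
        info[key.strip()] = value.strip()
    return info


# Illustrative sample only; all values are placeholders.
SAMPLE_OUTPUT = """\
   Auto Negotiation: true
   Driver Info:
         Bus Info: 0000:00:00:0
         Driver: bnxtnet
         Firmware Version: 0.0.0.0
         Version: 0.0.0.0
   Link Detected: false
"""

if __name__ == "__main__":
    details = parse_driver_info(SAMPLE_OUTPUT)
    print(details.get("Driver"), details.get("Version"), details.get("Firmware Version"))
```

The extracted driver name, driver version, and firmware version are the values to look up in the compatibility guide entry for the adapter.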

Additional Information