This article describes a scenario where a host experiences a loss of network connectivity on specific interfaces (e.g., vmnic#). The network adapter becomes unresponsive, causing the driver to mark the links as "down."
During this event, kernel logs typically indicate that the driver stopped receiving responses from the firmware. Initially, commands used to gather port statistics (HWRM_FUNC_QSTATS) will time out:
####-##-##T##:##:##.###Z Wa(###) vmkwarning: cpu##:#######)WARNING: bnxtnet: hwrm_send_msg:###: [vmnic# : 0x############] HWRM cmd resp_len timeout, cmd_type 0x##(HWRM_FUNC_QSTATS) seq #####
When the driver subsequently probes the firmware status, it receives a specific error code (0x89021), confirming that the device firmware has crashed:
####-##-##T##:##:##.###Z Wa(###) vmkwarning: cpu##:#######)WARNING: bnxtnet: hwrm_get_version:####: [vmnic# : 0x############] VER_GET failed- FW_STATUS_REG: 0x89021
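The two driver messages above can be searched for programmatically. The following is a minimal Python sketch, not a supported tool: the function names `scan_driver_log` and `affected_nics` are hypothetical, and the regular expressions are modeled only on the two excerpts shown in this article, so they may need adjusting if the log layout differs in a given build.

```python
import re

# Patterns modeled on the two log excerpts in this article (assumption:
# real log lines follow the same layout, with digits in place of '#').
TIMEOUT_RE = re.compile(
    r"bnxtnet: hwrm_send_msg:\d+: \[(?P<nic>vmnic\d+)\s*:\s*0x[0-9a-fA-F]+\] "
    r"HWRM cmd resp_len timeout, cmd_type 0x[0-9a-fA-F]+\((?P<cmd>\w+)\)"
)
VER_GET_RE = re.compile(
    r"bnxtnet: hwrm_get_version:\d+: \[(?P<nic>vmnic\d+)\s*:\s*0x[0-9a-fA-F]+\] "
    r"VER_GET failed- FW_STATUS_REG: (?P<status>0x[0-9a-fA-F]+)"
)


def scan_driver_log(lines):
    """Summarize HWRM timeouts and VER_GET failures per vmnic.

    Returns {vmnic: {"timeouts": int, "fw_status": set of hex strings}}.
    """
    result = {}
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            entry = result.setdefault(m["nic"], {"timeouts": 0, "fw_status": set()})
            entry["timeouts"] += 1
            continue
        m = VER_GET_RE.search(line)
        if m:
            entry = result.setdefault(m["nic"], {"timeouts": 0, "fw_status": set()})
            entry["fw_status"].add(m["status"].lower())
    return result


def affected_nics(result, status="0x89021"):
    """Return the sorted vmnic names that reported the given FW_STATUS_REG value."""
    return sorted(nic for nic, e in result.items() if status in e["fw_status"])
```

Passing an open log file (or any iterable of lines) to `scan_driver_log` and then calling `affected_nics` on the result lists the interfaces that reported FW_STATUS_REG 0x89021.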
The root cause is a TCAM parity error in the RE CFA (Complex Flow Accelerator) of the network card's firmware. Firmware trace dumps confirm this condition by logging a "CRT FATAL ERROR" alongside the crash event:
####-##-##T##:##:##Z In(###) vmkernel: ####.#:D:Register re_cfa_int_sts_0:0x########: 0x9021
####-##-##T##:##:##Z In(###) vmkernel: ####.#:D:CRT FATAL ERROR: 0x9021
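As a rough illustration, the firmware trace entries can be paired up so that each "CRT FATAL ERROR" is reported together with the register dump logged just before it. This Python sketch assumes one trace entry per line in the format of the excerpt above; `find_fatal_events` is a hypothetical helper name, and matching any register name (not only `re_cfa_int_sts_0`) is an assumption made for illustration.

```python
import re

# Modeled on the trace excerpt in this article. Assumption: other register
# dumps, if present, use the same "Register <name>:<address>: <value>" layout.
REGISTER_RE = re.compile(
    r"Register (?P<name>\w+):0x[0-9a-fA-F]+: (?P<value>0x[0-9a-fA-F]+)"
)
FATAL_RE = re.compile(r"CRT FATAL ERROR: (?P<code>0x[0-9a-fA-F]+)")


def find_fatal_events(lines):
    """Return one dict per CRT FATAL ERROR line.

    Each dict holds the fatal error code and, if a register dump line
    preceded it, that register's name and value (otherwise None).
    """
    events = []
    last_register = None
    for line in lines:
        m = REGISTER_RE.search(line)
        if m:
            # Remember the most recent register dump for the next fatal error.
            last_register = (m["name"], m["value"])
            continue
        m = FATAL_RE.search(line)
        if m:
            events.append({"code": m["code"], "register": last_register})
            last_register = None
    return events
```

An event whose `register` name is `re_cfa_int_sts_0` corresponds to the signature described in this article; an empty result means the trace does not contain a CRT FATAL ERROR entry in this format.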
This error is typically an intermittent "soft error" caused by environmental factors (such as random alpha particles flipping a bit in the TCAM memory). In rare cases, it can indicate a physical hardware defect if the issue occurs repeatedly.
To restore connectivity, perform the following steps:
Japanese KB: Broadcom NIC ファームウェアのクラッシュによるリンクダウン(エラー 0x89021)