ESXi host hangs abruptly when using Cisco UCS VIC

Article ID: 306379

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • ESXi host hangs abruptly when using Cisco UCS Virtual Interface Card (VIC).
  • ESXi host fails to respond to ping requests.
  • Login to the DCUI works, but the host does not respond to commands.
  • Unable to see any hardware faults in the UCS Manager.
  • The /var/run/log/vmkernel.log file contains error messages similar to:

yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.0:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu44:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.1:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu44:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:68:00.3:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:68:00.4:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:68:00.5:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:68:00.6:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.2:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.3:Timed out devcmd 4
yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.4:Timed out devcmd 4
...
...
yyyy-mm-ddTHH:MM:SS.SSSS cpu24:2097238)WARNING: Uplink: 21913: Queue 0 of device vmnic2 stuck, resetting the device
yyyy-mm-ddTHH:MM:SS.SSSS cpu24:2097238)WARNING: Uplink: 21913: Queue 0 of device vmnic3 stuck, resetting the device
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)nenic: enic_uplink_reset:3440: [0000:62:00.3] Resetting
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 36, hardware surprise removal
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 8, hardware surprise removal
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)nenic: enic_quiesce_dev:902: [0000:62:00.3] enter
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)nenic: enic_quiesce_dev:920: [0000:62:00.3] disable device
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 29, hardware surprise removal
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: enic_uplink_tq_stop:264: [0000:62:00.3] Failed to stop TX queue 0. Already stopped
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 13, hardware surprise removal
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: vnic_dev_del_addr:1053: Failed to del addr [00:25:b5:b5:00:2e], -195887120
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: vnic_wq_disable:239: Failed to disable WQ[0]
yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 36, hardware surprise removal

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on the environment.
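
As a rough aid for checking a collected vmkernel.log against this signature, the following Python sketch counts the three message types shown in the excerpt: devcmd timeouts, stuck uplink queues, and fatal devcmd2 errors. The regular expressions are derived only from the example lines in this article, and the function name summarize and the SAMPLE list are illustrative assumptions. Message wording may differ between nenic driver versions, so treat the patterns as a starting point rather than a definitive match.

```python
import re
from collections import Counter

# PCI function address as it appears in the excerpt, e.g. 0000:62:00.3
PCI = r"([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])"

# Patterns modeled on the example lines in this article (assumption:
# other driver versions may word these messages differently).
PATTERNS = {
    "timeout": re.compile(r"nenic: _vnic_dev_cmd2:\d+: " + PCI + r":\s*Timed out devcmd \d+"),
    "fatal": re.compile(r"nenic: _vnic_dev_cmd2:\d+: " + PCI
                        + r":\s*Fatal error while issuing devcmd2 command \d+"),
    "stuck": re.compile(r"Uplink: \d+: Queue \d+ of device (vmnic\d+) stuck"),
}


def summarize(lines):
    """Count signature messages, keyed by PCI function or uplink name."""
    counts = {name: Counter() for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name][match.group(1)] += 1
    return counts


# Three lines copied from the excerpt above, used here as sample input.
SAMPLE = [
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.0:Timed out devcmd 4",
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu24:2097238)WARNING: Uplink: 21913: Queue 0 of device vmnic2 stuck, resetting the device",
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 36, hardware surprise removal",
]

for kind, counter in summarize(SAMPLE).items():
    for device, count in sorted(counter.items()):
        print(kind, device, count)
```

For a real log, the lines could be read from a copy of /var/run/log/vmkernel.log and passed to summarize in place of SAMPLE. A non-zero count in all three categories would be consistent with the symptom described here, but it does not by itself confirm the cause.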



Environment

vSphere ESXi 7.x

Cause

  • The vNICs failed to respond to driver commands.
  • The driver's attempt to revive the vNIC by resetting it also failed.
  • Because the reset failed, the vNIC stays in a disabled state, which leads to network communication failure.
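
The sequence above can be traced per PCI function in the log. The sketch below is a minimal illustration, assuming the message formats from the example excerpt. The stage labels ("unresponsive", "reset attempted", "reset failed") and the function name failure_stage are hypothetical names chosen for this example and are not terms used by the nenic driver.

```python
import re

# PCI function address as it appears in the excerpt, e.g. 0000:62:00.3
PCI = r"([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])"

# Patterns modeled on the example log lines in this article (assumption).
TIMED_OUT = re.compile(r"nenic: _vnic_dev_cmd2:\d+: " + PCI + r":\s*Timed out devcmd")
RESETTING = re.compile(r"nenic: enic_uplink_reset:\d+: \[" + PCI + r"\] Resetting")
FATAL = re.compile(r"nenic: _vnic_dev_cmd2:\d+: " + PCI
                   + r":\s*Fatal error while issuing devcmd2")


def failure_stage(lines):
    """Map each PCI function to the stage it reached, in log order."""
    stage = {}
    for line in lines:
        match = TIMED_OUT.search(line)
        if match:
            # A timeout only sets the first stage; it never overwrites a later one.
            stage.setdefault(match.group(1), "unresponsive")
            continue
        match = RESETTING.search(line)
        if match:
            stage[match.group(1)] = "reset attempted"
            continue
        match = FATAL.search(line)
        # A fatal devcmd2 error counts as a failed reset only after a reset began.
        if match and stage.get(match.group(1)) == "reset attempted":
            stage[match.group(1)] = "reset failed"
    return stage


# Lines copied from the example excerpt, used here as sample input.
SAMPLE = [
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.0:Timed out devcmd 4",
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu40:2097412)WARNING: nenic: _vnic_dev_cmd2:331: 0000:62:00.3:Timed out devcmd 4",
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)nenic: enic_uplink_reset:3440: [0000:62:00.3] Resetting",
    "yyyy-mm-ddTHH:MM:SS.SSSS cpu38:2097413)WARNING: nenic: _vnic_dev_cmd2:294: 0000:62:00.3: Fatal error while issuing devcmd2 command 36, hardware surprise removal",
]

for device, reached in sorted(failure_stage(SAMPLE).items()):
    print(device, reached)
```

A function that reaches "reset failed" matches the third bullet: the driver tried to recover the vNIC and could not. This is only a way to read the log; it does not diagnose why the adapter stopped responding.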

Resolution

Engage Cisco, the hardware vendor, for further diagnostics and resolution.