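When this issue occurs, the following behavior may be seen:
To identify which physical NIC a VMkernel adapter is using, run esxtop, press Enter, and then press the letter "n" to open the network view. In the example below, the VMkernel adapter vmk2 is responsible for vMotion and is currently using vmnic3:
VMware vSphere ESXi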
This issue occurs due to a firmware-driven NIC reset triggered by the driver's detection of a transmit (TX) hang. High-throughput operations, such as vMotion, can expose timing sensitivities or buffer management defects in specific NIC driver/firmware combinations, leading to a temporary hardware stall that forces a reset.
To confirm the issue, review the ESXi host's vmkernel log for TX hangs reported by the NIC driver. Please note that the output below is an example and actual log output may differ.
Type cd /var/log and press Enter to access the log directory.
Type cat vmkernel.log | less -i and press Enter to open the log; the -i option makes searches ignore case.
Type /tx hang and press Enter to check whether any TX hang issues have been reported.
If there are, the output will look similar to the example below (please note that the driver type and PCI identifier, in this case ixgben 0000:##:00.0, may differ from case to case based on the hardware in use):

Because the NIC driver is the component reporting the TX hangs, it is recommended to work with the driver/firmware vendor to investigate the issue further. The NIC driver is outside the scope of VMware by Broadcom support.
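As a non-interactive alternative to the log search steps above, the same case-insensitive search can be sketched with grep. The helper names find_tx_hangs and count_tx_hangs are hypothetical and not part of ESXi. The default path assumes the log is at /var/log/vmkernel.log as in the steps above, and the exact message wording depends on the NIC driver in use.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with ESXi) that search a vmkernel log
# for the phrase "tx hang", ignoring case.

# Print every line that mentions a TX hang.
# $1: path to the log file (defaults to /var/log/vmkernel.log)
find_tx_hangs() {
    logfile="${1:-/var/log/vmkernel.log}"
    grep -i "tx hang" "$logfile"
}

# Print only the number of matching lines.
# $1: path to the log file (defaults to /var/log/vmkernel.log)
count_tx_hangs() {
    logfile="${1:-/var/log/vmkernel.log}"
    grep -ci "tx hang" "$logfile"
}
```

For example, after defining the functions in an SSH session on the host, count_tx_hangs /var/log/vmkernel.log prints the number of matching lines and find_tx_hangs /var/log/vmkernel.log prints the lines themselves.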
It is also recommended that the environment uses the most up-to-date driver/firmware versions listed in the compatibility guide, which contains vendor-tested versions.
For more details on checking driver/firmware versions, please refer to Determining Network/Storage firmware and driver version in ESXi and the VMware by Broadcom Compatibility Guide.
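As a quick illustration of checking the versions on a host, the sketch below wraps the esxcli network nic get command, whose "Driver Info" section lists the driver name, driver version, and firmware version for one uplink. The wrapper name nic_driver_info is hypothetical, and vmnic3 is only the uplink from the earlier example.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of ESXi): show driver and firmware
# details for a single uplink by calling esxcli.
# $1: uplink name, for example vmnic3
nic_driver_info() {
    nic="${1:?usage: nic_driver_info <vmnic>}"
    esxcli network nic get -n "$nic"
}
```

For example, nic_driver_info vmnic3 prints the details for vmnic3. The reported versions can then be compared against the entries in the compatibility guide.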