The following message is observed in the vmkernel logs of ESXi hosts:
xxxx-xx-xxTxx:xx:xx.xxxZ cpu24:xxxxxxx)ixgben: ixgben_CheckTxHang:xxxx: vmnicX: false hang detected on TX queue 2
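To see how often the message appears, and on which NICs and TX queues, the log can be summarized. This is a minimal sketch and not part of the original procedure: the function name summarize_false_hangs is hypothetical, and it assumes the message format shown above.

```shell
# Hypothetical helper: count "false hang" messages per NIC and TX queue.
# Reads log lines on stdin so it can be fed from any log file.
summarize_false_hangs() {
    # Keep only matching lines and reduce them to "vmnicN txq=Q".
    sed -n 's/.*: \(vmnic[0-9][0-9]*\): false hang detected on TX queue \([0-9][0-9]*\).*/\1 txq=\2/p' \
      | sort \
      | uniq -c \
      | awk '{ print $2, $3, "count=" $1 }'
}
```

Assuming the log is in its default location, a typical call would be: summarize_false_hangs < /var/log/vmkernel.log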
ESXi 7.x
ESXi 8.x
The root cause is a bug in Intel's ixgben driver related to queue utilization, which triggers "TX false hang" messages in the vmkernel logs.
"TX false hang" indicates a possible interrupt loss which may impact the traffic performance through physical NICs.
This happens if the physical NICs carrying traffic have a higher number of TX queues than RX queues.
Intel fixed the issue in async driver version ixgben-1.22.1.0, which is available for ESXi 8.0 and later.
To work around the issue on ESXi 7.0, or with a driver version earlier than ixgben-1.22.1.0, use the following steps:
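To decide whether a host still needs the workaround, the installed driver version can be compared against the fixed version. The following is a sketch under stated assumptions: the function name ixgben_needs_workaround is hypothetical, the version is passed in as a dotted numeric string, and any build suffix after the first "-" is ignored. It checks only the driver version, not the ESXi release.

```shell
# Hypothetical helper: return 0 (true) if the given ixgben driver
# version is older than 1.22.1.0, the version that contains the fix.
ixgben_needs_workaround() {
    ver=${1%%-*}    # drop any build suffix after the first "-"
    awk -v v="$ver" -v fixed="1.22.1.0" 'BEGIN {
        n = split(v, a, "."); m = split(fixed, b, ".")
        max = (n > m) ? n : m
        # Compare field by field, treating missing fields as 0.
        for (i = 1; i <= max; i++) {
            x = (i <= n) ? a[i] + 0 : 0
            y = (i <= m) ? b[i] + 0 : 0
            if (x < y) exit 0
            if (x > y) exit 1
        }
        exit 1    # equal versions: the fix is already present
    }'
}
```

The driver version in use for a NIC is shown in the output of esxcli network nic get -n vmnicX and can be passed to the function by hand, for example: ixgben_needs_workaround 1.15.1.0 && echo "workaround needed"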
$ esxcli system module parameters list -m ixgben
Note any parameters that are already set so that you can verify them after the change.
$ esxcli system module parameters set -p DevRSS=<value_list> -m ixgben
$ reboot
or
$ esxcli system module parameters set -a -p DevRSS=<value_list> -m ixgben
$ reboot
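The "-a" option appends the new parameter and leaves the other configured parameters in place. Without "-a", the module's currently configured parameter string is replaced by the one specified.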
The "value_list" for DevRSS is typically a comma-separated list of values, where each value corresponds to a specific physical NIC handled by that module.
A value of 1 enables RSS for that specific physical port.
A value of 0 disables RSS for that specific physical port.
For example, esxcli system module parameters set -p DevRSS=1,1,1,1 -m ixgben enables DevRSS on all 4 physical NICs of the ESXi host.
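For hosts with a different number of ixgben ports, the value list can be generated rather than typed by hand. This is a minimal sketch: the function name build_devrss_list is hypothetical, and it assumes the same value is wanted for every port.

```shell
# Hypothetical helper: print a comma-separated DevRSS value list.
# $1 = number of physical ports handled by the module
# $2 = value to use for each port (defaults to 1, i.e. enabled)
build_devrss_list() {
    count=$1
    value=${2:-1}
    list=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Add a comma only when the list already has an entry.
        list="${list:+$list,}$value"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}
```

For a host with 4 ports this prints 1,1,1,1, so the result can be used directly in the command: esxcli system module parameters set -a -p "DevRSS=$(build_devrss_list 4)" -m ixgben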
NOTE:
DevRSS conflicts with RSS and DRSS. Remove any existing RSS and DRSS configurations.
To confirm the condition on an affected ESXi host, run vsish -e get /net/pNics/vmnicX/stats and identify the RX queues whose "rxPkts" counter is incrementing and the TX queues whose "txPkts" counter is incrementing. Check whether more TX queues than RX queues are in use.
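One way to spot incrementing counters is to save the stats output twice, a few seconds apart, and list only the rxPkts and txPkts lines that changed. The sketch below is an illustration, not a documented procedure: the function name changed_queue_counters and the file names are hypothetical. It assumes that diff is available in the shell and that the counter names appear on the lines that change. It makes no other assumption about the layout of the stats output.

```shell
# Hypothetical helper: show rxPkts/txPkts lines that differ between
# two saved copies of the stats output.
# $1 = earlier snapshot file, $2 = later snapshot file
changed_queue_counters() {
    # Lines starting with ">" are the new versions of changed lines.
    diff "$1" "$2" | grep -E '^> .*(rxPkts|txPkts)'
}
```

A possible sequence is: vsish -e get /net/pNics/vmnicX/stats > /tmp/stats.1; sleep 10; vsish -e get /net/pNics/vmnicX/stats > /tmp/stats.2; changed_queue_counters /tmp/stats.1 /tmp/stats.2. The queues listed in the result are the ones carrying traffic during the interval.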