In NSX/vDefend environments where SCRX/Turbo mode is enabled, traffic inspected by L7 or Distributed IDS/IPS may experience disruption if a vmxnet3 transmit queue on the infravisor pod enters a stopped state.
Symptoms:
On an affected ESXi host, one or more of the following symptoms may be observed:
Validation:
net-stats -l | grep infravisor-pod
Example:
[root@ESX:] net-stats -l | grep infravisor-pod
100663326 5 9 DvsPortset-0 00:0c:29:d3:22:fb infravisor-pod.eth0
In this example:
Portset: DvsPortset-0
Port ID: 100663326
Interface: infravisor-pod.eth0
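The portset and port ID can also be extracted from the net-stats output programmatically. The following is a minimal sketch, assuming the column layout matches the example output above (port ID in the first column, portset in the fourth, interface name in the sixth). parse_infravisor_ports is a hypothetical helper name, not an ESXi tool.

```shell
#!/bin/sh
# Assumption: `net-stats -l` rows follow the layout seen in the example:
#   <Port ID> <Type> <SubType> <Portset> <MAC address> <Interface>
# parse_infravisor_ports is a hypothetical helper, not part of ESXi.
parse_infravisor_ports() {
    # Reads `net-stats -l` output on stdin and prints one line per
    # infravisor pod port: "<Portset> <Port ID> <Interface>".
    awk '$6 ~ /^infravisor-pod/ { print $4, $1, $6 }'
}

# Usage on an ESXi host:
#   net-stats -l | parse_infravisor_ports
```

The values printed by this helper are the ones to substitute into the vsish path used for the Tx queue status check.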
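Check the status of each Tx queue with the following command, replacing <DvsPortset-X>, <Port ID>, and <Tx_queue_number> with the values for the affected host. The Tx_queue_number value can range from 0 to 3.
vsish -e get /net/portsets/<DvsPortset-X>/ports/<Port ID>/vmxnet3/txqueues/<Tx_queue_number>/status
Example: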
[root@ESX:] vsish -e get /net/portsets/DvsPortset-0/ports/100663326/vmxnet3/txqueues/0/status
status of a vmxnet3 vNIC tx queue {
intr index:1
stopped:1 <<<<<< This is the key indicator of the issue
error code:2147483655
next2Tx:1895
next2Comp:1895
ring size:2048
data ring desc size:128
ts ring desc size:0
genCount:0
next2Write:1894
next2Tx from timeout:65535
next2Comp from timeout:65535
timestamp in milliseconds in check:0
}
This issue applies to environments with:
This issue does not apply to Classic/VDPI deployments.
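To confirm whether a host is affected, the Tx queue status check from the Validation steps can be repeated for all four queues (0 to 3). The following is a minimal sketch that uses the same vsish path and treats a stopped:1 field as the indicator. check_tx_queues is a hypothetical helper name, and the VSISH override variable is an assumption added only so the logic can be dry-run off-host.

```shell
#!/bin/sh
# Hypothetical helper: checks vmxnet3 Tx queues 0-3 on one port for "stopped:1".
# VSISH may be set to a substitute command for dry runs (assumption, not an ESXi feature).
check_tx_queues() {
    portset="$1"    # e.g. DvsPortset-0
    port_id="$2"    # e.g. 100663326
    stopped=0
    for q in 0 1 2 3; do
        node="/net/portsets/${portset}/ports/${port_id}/vmxnet3/txqueues/${q}/status"
        # Skip queues whose status node cannot be read.
        if ! status=$(${VSISH:-vsish} -e get "$node" 2>/dev/null); then
            echo "txqueue ${q}: unable to read status"
            continue
        fi
        # "stopped:1" is the key indicator described in the Validation output.
        if printf '%s\n' "$status" | grep -Eq 'stopped:[[:space:]]*1([^0-9]|$)'; then
            echo "txqueue ${q}: STOPPED"
            stopped=1
        else
            echo "txqueue ${q}: not stopped"
        fi
    done
    # Returns 1 if at least one queue is stopped, 0 otherwise.
    return "$stopped"
}

# Usage on an ESXi host:
#   check_tx_queues DvsPortset-0 100663326
```

A non-zero return status means at least one queue reported stopped:1.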
The issue is triggered by inconsistent packet metadata provided by the guest vNIC driver. Specifically, for an offload packet, the reported header length may be larger than the total packet data length.
Example:
hlen=256
data_len=250
This issue is resolved in future releases of vDefend/NSX. Upgrade to a fixed release when it becomes available, or contact Broadcom Support.
If an immediate upgrade is not possible, restart the NSX-SCX service on the affected ESXi host.
/etc/init.d/nsx-scx-###### restart
Replace nsx-scx-###### with the applicable NSX-SCX service name on the affected host.
Example:
ls /etc/init.d/ | grep nsx-scx
/etc/init.d/<nsx-scx-BuildNumber> restart
Note: This is a temporary workaround. Restarting the NSX-SCX service resets the vmxnet3 queue state and may restore connectivity, but the queue may stop again if another packet with invalid metadata is encountered.
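The lookup and restart can be combined so that the build-specific service name does not have to be typed by hand. The following is a minimal sketch, assuming a single nsx-scx entry exists under /etc/init.d. find_scx_service and restart_scx_service are hypothetical helper names, and the directory argument is an assumption added only so the logic can be dry-run.

```shell
#!/bin/sh
# Hypothetical helpers; not part of NSX or ESXi.
find_scx_service() {
    # Prints the first init script whose name contains "nsx-scx".
    initd="${1:-/etc/init.d}"
    ls "$initd" 2>/dev/null | grep nsx-scx | head -n 1
}

restart_scx_service() {
    initd="${1:-/etc/init.d}"
    svc=$(find_scx_service "$initd")
    if [ -z "$svc" ]; then
        echo "no nsx-scx service found in ${initd}" >&2
        return 1
    fi
    echo "restarting ${svc}"
    # Same command as the manual workaround: /etc/init.d/<nsx-scx-BuildNumber> restart
    "${initd}/${svc}" restart
}

# Usage on the affected ESXi host:
#   restart_scx_service
```

As with the manual steps, this only resets the queue state temporarily.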