Frames smaller than the 64-byte minimum Ethernet frame size may be dropped at the physical layer (ESXi pNIC/physical switches) due to a checksum offload issue
Article ID: 336805
Updated On:
Products
VMware NSX
Issue/Introduction
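This article describes an issue in which small Ethernet frames are padded with non-zero bytes on ESXi and may then be dropped at the physical layer, and it lists the releases that contain the fix.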
Symptoms:
Intermittent packet drop
Application slowness
Common protocols like RDP/SSH may be impacted
Not all pNICs or physical switches drop the affected packets
The issue is unlikely to be seen within the data center, because small packets are usually GENEVE-encapsulated while traversing the physical medium
The issue manifests when small packets are padded incorrectly and handed to a VM that forwards them. One example is HCX forwarding a packet with non-zero padding
To confirm, capture packets at the vNIC of the HCX appliance (or any packet-forwarding VM) and at the relevant pNIC, then search for typical small packets such as TCP ACK and RST
The ingress packet on the pNIC shows the small packet, whereas the ingress packet at the VM shows it padded with non-zero bytes
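The check described above can be sketched in Python. This is a minimal illustration, not a VMware tool: it assumes an untagged IPv4 frame exported as raw bytes from a capture, and the helper names (`trailing_padding`, `has_nonzero_padding`, `build_ack_frame`) are hypothetical. It treats every byte after the IPv4 Total Length as Ethernet padding. A bare TCP ACK is 14 + 20 + 20 = 54 bytes, so 6 padding bytes are needed to reach 60 bytes (the 64-byte minimum without the 4-byte FCS).

```python
import struct

ETH_HEADER_LEN = 14  # untagged Ethernet II header (assumption: no VLAN tag)


def trailing_padding(frame: bytes) -> bytes:
    """Return the bytes that follow the IPv4 datagram inside the frame."""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:
        raise ValueError("this sketch only handles untagged IPv4 frames")
    # IPv4 Total Length sits at offset 2 of the IP header.
    total_len = struct.unpack("!H", frame[16:18])[0]
    return frame[ETH_HEADER_LEN + total_len:]


def has_nonzero_padding(frame: bytes) -> bool:
    """True if any padding byte after the IP datagram is non-zero."""
    return any(trailing_padding(frame))


def build_ack_frame(padding: bytes) -> bytes:
    """Build a synthetic 54-byte TCP ACK frame plus the given padding."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", 0x0800)
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 40, 0, 0, 64, 6, 0, bytes(4), bytes(4))
    tcp = bytes(20)  # placeholder TCP header, contents irrelevant here
    return eth + ip + tcp + padding


if __name__ == "__main__":
    good = build_ack_frame(bytes(6))
    bad = build_ack_frame(b"\x00\x00\xde\xad\x00\x00")
    print(len(good), has_nonzero_padding(good), has_nonzero_padding(bad))
```

Running the same check on the frame as seen at the pNIC and as seen at the vNIC makes it easy to tell on which side the non-zero bytes appear.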
Steps to reproduce the issue:
Create the following topology
VM-A on Host-A ---- NSX-T Edge on Host-B ---- VM-B(Any VM or HCX-NE) on Host-C
Configure VM-A as ncat client and VM-B as ncat server
Send some characters/packets from VM-A to VM-B
On Host-C, where VM-B resides, capture packets at the vmnic (pNIC) and at the vNIC.
Review the captures. Some of the small packets will show non-zero padding at the vNIC.
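Where ncat is not available, the traffic in the steps above can be approximated with a few lines of Python. This is a hedged stand-in rather than part of the original procedure: `serve_once` and `send_chars` are hypothetical helper names, and the host and port are whatever VM-B listens on. Sending one character per segment produces many small data and ACK packets, which are the frames that need padding.

```python
import socket


def serve_once(listener: socket.socket, received: list) -> None:
    """Run on VM-B: accept one connection and collect everything sent."""
    conn, _ = listener.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # client closed the connection
                break
            received.append(chunk)


def send_chars(host: str, port: int, text: str) -> None:
    """Run on VM-A: send the text one character at a time."""
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle so each character leaves as its own small segment.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for ch in text:
            sock.sendall(ch.encode())
```

On VM-B, create a listening socket and pass it to `serve_once`; on VM-A, call `send_chars` with VM-B's address while the captures on Host-C are running.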
Environment
VMware NSX-T Data Center 3.x
VMware NSX-T Data Center
Cause
Frames smaller than the minimum Ethernet frame size must be padded before they traverse the network. On ESXi, the padding routine vmk_PktPadFrame does not work as expected when frameMappedLen > frameLen: it creates a new sg (scatter-gather) element, which results in non-zero padding of the small Ethernet frame. This in turn causes checksum offload issues at the pNIC or at physical switches, and as a result the packets/frames are dropped.
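One way to see why non-zero padding can upset checksum handling is sketched below. It rests on an assumption the article does not state: that some device in the path computes the Internet checksum (the 16-bit one's-complement sum used by IP, TCP, and UDP) over a span that includes the padding bytes. Under that assumption, zero padding is harmless because zero words add nothing to the sum, while non-zero padding changes the result. The function name is illustrative only.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as used by IP, TCP, and UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    segment = bytes(range(1, 21))  # stand-in for a 20-byte TCP header
    zero_pad = segment + bytes(6)
    dirty_pad = segment + b"\xde\xad\xbe\xef\x00\x01"
    # Zero padding leaves the checksum unchanged; non-zero padding does not.
    print(internet_checksum(segment) == internet_checksum(zero_pad))
    print(internet_checksum(segment) == internet_checksum(dirty_pad))
```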
Resolution
Upgrade to NSX-T 3.2.2.0.1 HP, 3.2.3, or NSX 4.1.0.0
Workaround: NA
Additional Information
Impact/Risks:
Application slowness and sluggishness while using RDP/SSH