Container to Container VXLAN communication (TAS) getting dropped for TCP Checksum errors

Article ID: 423328

Updated On:

Products

VMware NSX

Issue/Introduction

  • Container-to-container (C2C) networking does not work for Tanzu Application Services (TAS).
  • The application containers run on Ubuntu Diego-Cell VMs.
  • VM to VM VXLAN traffic is using standard VXLAN port 4789.
  • ESXi hosts are prepared for NSX.
  • Diego-Cell VMs are connected to NSX overlay segments.
  • Packet captures taken at the containers show packets being dropped by the container due to TCP checksum (CSUM) errors.
  • Applying the workaround described in the following two KBs resolves the problem: KB 298181 and KB 324199.
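
To confirm from a capture that a dropped segment really carries a bad TCP checksum, the check can be reproduced offline. The following is a minimal sketch, assuming IPv4 and that the raw TCP segment bytes (header plus payload) and the inner source and destination addresses have already been extracted from the capture. The function names, addresses, and port numbers are hypothetical and used for illustration only.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Ones' complement checksum (RFC 1071), folded to 16 bits and inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """IPv4 pseudo-header covered by the TCP checksum (protocol 6 = TCP)."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 6, tcp_length))


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the checksum stored in the segment matches its contents."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


def build_segment(src_ip: str, dst_ip: str, payload: bytes) -> bytes:
    """Build a demo TCP segment with a valid checksum (illustrative values)."""
    header = struct.pack("!HHIIBBHHH",
                         40000, 8080,  # hypothetical source/destination ports
                         1, 0,         # sequence and acknowledgement numbers
                         5 << 4,       # data offset: 5 words, no options
                         0x18,         # flags: PSH + ACK
                         65535, 0, 0)  # window, checksum placeholder, urgent
    segment = header + payload
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    # The checksum field occupies bytes 16..17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:] + payload


if __name__ == "__main__":
    good = build_segment("10.255.1.2", "10.255.2.3", b"hello")
    bad = good[:-1] + bytes([good[-1] ^ 0xFF])  # corrupt the last payload byte
    print("valid segment passes:", tcp_checksum_ok("10.255.1.2", "10.255.2.3", good))
    print("corrupted segment passes:", tcp_checksum_ok("10.255.1.2", "10.255.2.3", bad))
```

A segment for which `tcp_checksum_ok` returns `False` at the receiving container is consistent with the drops described above.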

Environment

  • VMware NSX 3.x / 4.x / VCF 9
  • VMware ESXi server 7.x / 8.x
  • Tanzu Application Services (TAS)

Cause

NSX overlay segments do not support VXLAN over Geneve encapsulation when inner offloads are enabled.

C2C communication works on NSX VLAN segments, even with inner offloads, because VLAN segments involve no Geneve encapsulation. It does not work on NSX overlay segments.

If the source and destination Diego-Cell VMs are both on the same ESXi host, C2C communication also works (even with inner offloads), because no Geneve encapsulation is involved. This can make the problem appear intermittent, as containers and VMs may move.
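
The conditions described in this article can be summarized as a small decision function, which may help when triaging which Diego-Cell pairs are likely affected. This is a sketch only: the function and parameter names are hypothetical, and it assumes that turning off inner offloads avoids the problem, which the article implies but does not state outright.

```python
def geneve_encapsulated(segment_type: str, src_host: str, dst_host: str) -> bool:
    """Geneve encapsulation applies only on overlay segments across ESXi hosts."""
    return segment_type == "overlay" and src_host != dst_host


def c2c_at_risk(segment_type: str, src_host: str, dst_host: str,
                inner_encap: str = "vxlan", inner_offloads: bool = True) -> bool:
    """True when C2C traffic matches the unsupported combination.

    segment_type   -- "overlay" or "vlan" (hypothetical labels)
    inner_encap    -- "vxlan" for the TAS container network, "geneve" for Antrea
    inner_offloads -- assumption: disabling inner offloads avoids the issue
    """
    return (geneve_encapsulated(segment_type, src_host, dst_host)
            and inner_encap == "vxlan"
            and inner_offloads)


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    cases = [
        ("overlay", "esxi-01", "esxi-02", "vxlan"),
        ("overlay", "esxi-01", "esxi-01", "vxlan"),
        ("vlan", "esxi-01", "esxi-02", "vxlan"),
        ("overlay", "esxi-01", "esxi-02", "geneve"),
    ]
    for segment_type, src, dst, encap in cases:
        risk = c2c_at_risk(segment_type, src, dst, inner_encap=encap)
        print(segment_type, src, dst, encap, "at risk:", risk)
```

Because the result depends on whether the two VMs share a host, the same pair of Diego-Cell VMs can flip between working and failing after a migration.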

Resolution

Apply any of the following workarounds:

i.  Continue with the workarounds provided in KB 298181.

ii. Do not connect the guest VMs to NSX overlay segments. Connect them to vSphere distributed port groups (dvpg) or NSX VLAN segments instead.

iii. Use Antrea CNI. NSX overlay segments support inner Geneve offload with Geneve encapsulation.