Large number of duplicate acknowledgements (DUP ACK) seen when doing data transfer over NSX L2VPN

Article ID: 392334

Products

VMware NSX

Issue/Introduction

  • An NSX L2VPN is used to transfer data. Either managed or autonomous edge nodes are used, with a VLAN segment.
  • Data transfers over the L2VPN may fail.
  • A packet capture shows a large number of duplicate acknowledgements and out-of-order packets.
  • A capture of the traffic entering the uplink side of the L2VPN edge node shows the packets in the correct order, but a capture on the downlink side of the same edge node shows them in a different order.
  • For example, the packets enter the uplink in sequence 1, 2, 3, 4, 5 but appear on the downlink side in sequence 1, 2, 4, 3, 5.
  • This causes the receiving endpoint to send duplicate acknowledgements back to the sender.
  • The edge node /var/log/syslog contains entries like the following:

edge01 NSX 8648 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="cache" level="DBG"] remove replaced flow cache entry
edge01 NSX 8648 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="cache" level="DBG"] message repeated 16 times: [remove replaced flow cache entry]
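
To gauge how often this message is logged, the syslog lines can be counted offline. The following is a minimal sketch, not an NSX tool: the function name `count_flow_cache_replacements` is hypothetical, and the sample input is just the two log lines shown above. It treats a "message repeated N times" line as N occurrences.

```python
import re

# The datapathd message quoted in this article.
FLOW_MSG = "remove replaced flow cache entry"
# syslog collapses consecutive identical messages into this form.
REPEATED = re.compile(r"message repeated (\d+) times:")


def count_flow_cache_replacements(lines):
    """Return how many times the flow cache message was logged."""
    total = 0
    for line in lines:
        if FLOW_MSG not in line:
            continue
        match = REPEATED.search(line)
        # A "message repeated N times" line stands for N occurrences.
        total += int(match.group(1)) if match else 1
    return total


sample = [
    'edge01 NSX 8648 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" '
    's2comp="cache" level="DBG"] remove replaced flow cache entry',
    'edge01 NSX 8648 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" '
    's2comp="cache" level="DBG"] message repeated 16 times: '
    '[remove replaced flow cache entry]',
]
print(count_flow_cache_replacements(sample))  # 17
```

To run it against a real log, the lines of a copy of /var/log/syslog taken from the edge node can be passed in place of `sample`.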

Environment

VMware NSX

Cause

This issue occurs when jumbo frames are used (MTU greater than 1500) and the DF (Don't Fragment) bit is set on packets on VLAN segments traversing the L2VPN.
When flow cache is enabled, packets with a large MTU are not processed correctly: they do not enter the fast path and are handled on the slow path instead, while packets with a smaller MTU go through the fast path. Because the fast path forwards packets sooner than the slow path, smaller packets can overtake larger ones, so packets leave the edge node in the wrong order, which leads to the duplicate acknowledgements.
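
The effect of the two paths can be illustrated with a toy model. The sketch below is not NSX code: the function names, the 1500-byte fast-path threshold, the packet sizes and the slow-path delay are all assumptions, chosen only to reproduce the 1, 2, 4, 3, 5 sequence from the example above. The receiver is a simplified cumulative-ACK model that acknowledges the next segment number it expects.

```python
def egress_order(packets, fast_path_max_bytes=1500, slow_path_delay=1.5):
    """Toy model: packet i arrives at time i; slow-path packets leave later.

    packets is a list of (sequence_number, size_in_bytes) in arrival order.
    The threshold and delay are illustrative assumptions, not NSX values.
    """
    departures = []
    for arrival_time, (seq, size) in enumerate(packets):
        delay = slow_path_delay if size > fast_path_max_bytes else 0.0
        departures.append((arrival_time + delay, seq))
    departures.sort()
    return [seq for _, seq in departures]


def cumulative_acks(arrival_order):
    """Return the ACK (next expected segment) sent after each arrival."""
    received = set()
    next_expected = 1
    acks = []
    for seq in arrival_order:
        received.add(seq)
        # Advance past every segment that has now arrived in order.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def count_duplicate_acks(acks):
    """Count ACKs that repeat the value of the ACK sent just before them."""
    return sum(1 for prev, cur in zip(acks, acks[1:]) if prev == cur)


# Packet 3 is the only jumbo packet, so it alone takes the slow path.
uplink = [(1, 1400), (2, 1400), (3, 9000), (4, 1400), (5, 1400)]
downlink = egress_order(uplink)
acks = cumulative_acks(downlink)
print(downlink)                    # [1, 2, 4, 3, 5]
print(acks)                        # [2, 3, 3, 5, 6]
print(count_duplicate_acks(acks))  # 1
```

In this model a single overtaken packet produces one duplicate acknowledgement. On a sustained transfer with many large packets the same effect repeats, which is consistent with the large number of duplicate acknowledgements seen in the captures.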

Resolution

This is a known issue impacting VMware NSX.

To work around the issue, use one of the following two options:

1. Set the MTU to 1350 on both endpoints of the L2VPN, that is, on the clients communicating through the L2VPN. This ensures that all packets are small and are processed correctly by flow cache.

Or

2. Disable flow cache on the edge node. For details on flow cache and how to disable it, review the following KB: NSX-T Edge Flow Cache
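
For option 1, it can help to know the largest TCP payload that fits in one packet at the reduced MTU, for example when checking a capture after the change. The following is a minimal sketch of that arithmetic, assuming standard 20-byte IPv4 and 40-byte IPv6 headers and a 20-byte TCP header with no options. The function name `tcp_mss` is hypothetical.

```python
def tcp_mss(mtu, ip_header_bytes=20, tcp_header_bytes=20):
    """Largest TCP payload per packet for a given MTU.

    Defaults assume IPv4 and TCP headers without options.
    """
    overhead = ip_header_bytes + tcp_header_bytes
    if mtu <= overhead:
        raise ValueError("MTU must be larger than the combined header size")
    return mtu - overhead


print(tcp_mss(1350))                      # 1310 (IPv4)
print(tcp_mss(1350, ip_header_bytes=40))  # 1290 (IPv6)
```

If TCP options such as timestamps are in use, the payload per packet is smaller by the size of those options.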

Additional Information

If this KB did not help resolve your issue, review the following KB for further troubleshooting steps: Troubleshooting NSX Edge and Virtual Machine (VM) Performance

If you are contacting Broadcom support about this issue, please provide the following:

  • NSX Manager and L2VPN edge node support bundles
  • Any packet captures taken to identify the issue and the commands used to take them, including the IP addresses and MAC addresses of the endpoints.
  • Status of the workaround performed: option 1, option 2, or both if attempted.

Handling Log Bundles for offline review with Broadcom support