Intermittent Packet loss on VXLAN network in NSX



Article ID: 339161


Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:
  • Unknown unicast flooding in one or more VXLANs leading to poor performance.
  • Intermittent packet loss on VXLAN networks.
  • Running the net-vdl2 -l command shows 1024 MAC addresses learned on the affected VNI:
VXLAN network: 5000
                        Multicast IP: N/A (headend replication)
                        Control plane: Enabled (multicast proxy,ARP proxy)
                        Controller: 10.0.0.1 (up)
                        MAC entry count: 1024  <--- Maximum MAC limit of 1024 reached.
                        ARP entry count: 0
                        Port count: 2
  • The VDL2 counters for the VNI show the "mac.update.full" counter incrementing:
net-vdl2 -S -s TENANT_VTEP -n 8007
mac.lookup.flood: 58193648
mac.lookup.full: 3356910
mac.update.learn: 0
mac.update.extend: 0
mac.update.full: 439420396  <--- This counter continues to increase.
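
The two checks above can be scripted against saved command output. The following is a minimal sketch, assuming the text layout matches the samples shown in this article; the function names are hypothetical and it does not run net-vdl2 itself.

```python
import re

MAC_LIMIT = 1024  # per VNI per host on the affected releases, per this article


def vnis_at_limit(listing: str, limit: int = MAC_LIMIT) -> list[int]:
    """Return VNIs whose 'MAC entry count' has reached the limit.

    Assumes 'net-vdl2 -l' style text in which each 'VXLAN network: <vni>'
    line is followed by that VNI's 'MAC entry count: <n>' line.
    """
    full = []
    current_vni = None
    for line in listing.splitlines():
        vni = re.search(r"VXLAN network:\s*(\d+)", line)
        if vni:
            current_vni = int(vni.group(1))
            continue
        count = re.search(r"MAC entry count:\s*(\d+)", line)
        if count and current_vni is not None and int(count.group(1)) >= limit:
            full.append(current_vni)
    return full


def parse_counters(stats: str) -> dict[str, int]:
    """Parse 'name: value' lines such as 'mac.update.full: 439420396'."""
    counters = {}
    for line in stats.splitlines():
        match = re.match(r"\s*([\w.]+):\s*(\d+)", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def is_incrementing(before: str, after: str, name: str = "mac.update.full") -> bool:
    """Compare two counter samples taken some time apart."""
    return parse_counters(after).get(name, 0) > parse_counters(before).get(name, 0)
```

Capture the counter output twice, a short interval apart, and pass both samples to is_incrementing. A rising mac.update.full together with a VNI returned by vnis_at_limit matches the symptoms described here.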


Environment

VMware NSX for vSphere 6.3.x
VMware NSX for vSphere 6.2.x

Cause

This issue occurs because earlier versions of NSX for vSphere have a maximum of 1024 MAC addresses per VNI per host. Once this limit is exceeded, the host cannot learn further MAC addresses, so traffic to those destination MAC addresses is flooded across the VNI as unknown unicast, leading to poor performance.

Note: MAC address learning occurs whenever broadcast, unknown unicast, or multicast traffic is received on a given VNI on an ESXi host, even if the virtual machines on that ESXi host are not communicating. The maximum limit of 1024 MAC addresses applies per VNI per host.

As part of the DLR ARP resolution process, the DLR uses an ARP proxy mechanism that may result in ESXi hosts learning the DLR pMAC addresses of other prepared ESXi hosts in the Transport Zone.
These MAC addresses are stored in the same MAC table that is used for VMs and therefore the maximum number of VM MAC addresses per host per VNI will actually be less than 1024.
This can explain why the MAC entry count may be at 1024 but the number of VM MAC addresses is lower.
The DLR ARP Suppression feature can reduce DLR ARP traffic. By reducing the number of DLR pMAC addresses learnt by a host, it allows more VM MAC addresses to be learnt.
DLR ARP Suppression is fully functional on NSX 6.4.0. For earlier releases of NSX, see the Workaround section.
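
The effect of the shared table can be shown with simple arithmetic. This is an illustrative sketch only: the function name and the sample figure of 40 learned pMAC addresses are hypothetical, and the 1024 limit is the one stated in this article.

```python
def vm_mac_capacity(dlr_pmacs_learned: int, limit: int = 1024) -> int:
    """Entries left for VM MAC addresses on one host for one VNI.

    DLR pMAC addresses share the same table as VM MAC addresses,
    so every pMAC learned reduces the room left for VMs.
    """
    if dlr_pmacs_learned < 0:
        raise ValueError("dlr_pmacs_learned must not be negative")
    return max(limit - dlr_pmacs_learned, 0)


# Hypothetical example: 40 DLR pMAC addresses learned on this VNI.
print(vm_mac_capacity(40))  # 984
```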

Resolution

NSX for vSphere 6.3.6, 6.4.0, and later versions increase the maximum MAC count per VNI per host from 1024 to 2048.
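
As a rough planning aid, the limit that applies to a given release can be derived from the statement above. The sketch below assumes a plain dotted version string and treats every release from 6.3.6 onwards as having the 2048 limit; the function names and sample counts are hypothetical.

```python
def mac_limit_for(version: str) -> int:
    """Per-VNI, per-host MAC limit for an NSX for vSphere version string.

    Assumption: 6.3.6 and anything later uses 2048, earlier releases 1024.
    """
    parts = tuple(int(p) for p in version.split(".")[:3])
    parts += (0,) * (3 - len(parts))  # pad "6.4" to (6, 4, 0)
    return 2048 if parts >= (6, 3, 6) else 1024


def vnis_over_limit(mac_counts: dict[int, int], version: str) -> list[int]:
    """Return VNIs whose MAC entry count has reached the limit for this version."""
    limit = mac_limit_for(version)
    return sorted(vni for vni, count in mac_counts.items() if count >= limit)


# Hypothetical MAC entry counts per VNI, as read from one host.
counts = {5000: 1024, 5001: 300}
print(vnis_over_limit(counts, "6.3.5"))  # [5000]
print(vnis_over_limit(counts, "6.4.0"))  # []
```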

Workaround:
To work around this issue, segment the VXLAN networks so that each VXLAN segment stays within the limit of 1024 MAC entries.
The DLR ARP Suppression workaround may also be considered. For more details, see DLR ARP traffic is not suppressed as expected in NSX.