L2 loop causes invalid MAC learning on NSX-T Edge Node when NPAR TX switching and L2VPN is enabled

Article ID: 322645

Updated On:

Products

VMware NSX Networking

Issue/Introduction

This article explains how to avoid an L2 loop when L2VPN is configured together with NPAR TX switching.
If you plan to bridge traffic, make sure that either NPAR or TX switching is disabled. Otherwise, traffic will loop because of the way the traffic flow is handled in the NetworkIOChain.

Symptoms:
  • You have configured an L2VPN for network extension (L2 bridge) through an NSX-T Edge Node.
  • Traffic is disrupted by an L2 loop.
  • You may see frequent MAC address flaps on the NSX-T Edge Node.


Environment

VMware NSX-T Data Center 4.x
VMware NSX-T Data Center 3.x
VMware NSX-T Data Center

Cause

When NPAR TX switching is enabled, an ARP packet can be received back on the same ESXi vmnic from which it egressed. This happens because an ARP packet sent from a netqueue can be received on the default queue of the same physical function (PF) and on other NPAR PFs on the same physical port (vmnic). This is expected behavior in an NPAR configuration with TX switching enabled.
When L2 bridging or L2VPN is configured in addition to NPAR TX switching, the "reflected" ARP packet can cause a loop: it travels back through the GRE/IPsec tunnel to the NSX-T Edge Node where the source VM is connected. The ARP packet is then received on the lswitch port connected to the remote side of the L2VPN, and the MAC address is relearned on the wrong lswitch port.

Resolution

Disable NPAR TX switching
 
Run the following commands as the root user on the ESXi host where the NSX-T Edge Node (L2 bridge) resides:
1. esxcfg-module -s 'npar_tx_switching=0' qedentv
2. reboot
Note: The command above applies only to NICs that use the qedentv (Marvell) driver. For any other NIC or driver, contact your NIC or server hardware vendor about similar behavior, commands, and settings.

Note: Because of a known issue, setting the npar_tx_switching module parameter to 0 can lead to a PSOD on ESXi if the qedentv driver version is lower than 3.40.57.0. The issue is fixed in later releases of the driver.
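
Because of the PSOD risk noted above, it may help to compare the installed qedentv driver version with 3.40.57.0 before changing the module parameter. The following is a minimal sketch, not a VMware-provided tool: version_lt is a hypothetical helper name, and the sketch assumes that the version string consists of dotted numeric fields, optionally followed by a "-" build suffix. The installed version would have to be read by hand, for example from the output of `esxcli software vib list | grep qedentv`.

```shell
#!/bin/sh
# Sketch only: version_lt is a hypothetical helper, not part of ESXi.
# Returns 0 (true) if dotted version $1 is lower than dotted version $2.
# Assumption: fields are numeric; anything after the first "-" is ignored.
version_lt() {
    a=${1%%-*}
    b=${2%%-*}
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Read the leading field of each version string.
        x=${a%%.*}
        y=${b%%.*}
        # Drop the field just read; an exhausted string becomes empty.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
        # Missing fields count as 0, so "3.40" compares like "3.40.0".
        x=${x:-0}
        y=${y:-0}
        [ "$x" -lt "$y" ] && return 0
        [ "$x" -gt "$y" ] && return 1
    done
    return 1   # the versions are equal
}

# Example use (driver_version is a placeholder value typed in by hand):
# driver_version="3.40.56.0"
# if version_lt "$driver_version" 3.40.57.0; then
#     echo "Update the qedentv driver before setting npar_tx_switching=0"
# fi
```

After the reboot, the configured option string for the module can be checked with `esxcfg-module -g qedentv`.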

Workaround:
There is no workaround applicable for this issue.