NSX-T Edge ECMP behavior

Article ID: 312977

Updated On:

Products

VMware NSX Networking

Issue/Introduction

This article describes northbound ECMP behavior on NSX-T Edge nodes when Tier-1 and Tier-0 gateways run on separate edge clusters.

Symptoms:
When two layers of edge nodes are deployed, with the T1 running on a separate edge cluster from the T0 edge cluster, northbound ECMP is not guaranteed to use all uplinks on the T0 Edge.

Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x

Cause

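ECMP hashing is based on a 5-tuple algorithm that uses source IP address, destination IP address, source port, destination port, and IP protocol. In general, this method gives a better distribution of traffic across all the available paths. However, the path for each flow is determined by the hash result rather than by link load, so an even spread across every uplink is not guaranteed. When ECMP decisions are made at two layers (Tier-1 to the Tier-0 edge nodes, then Tier-0 to its uplinks) and both layers have the same number of paths, the flows that arrive at a given Tier-0 edge node can all hash to the same uplink, leaving other uplinks unused.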
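
As a rough illustration of 5-tuple path selection, the sketch below hashes the five fields and takes the result modulo the number of available paths. The `Flow` type, the `ecmp_path` function, and the use of CRC32 are assumptions made for this example only; the actual hash function used by the NSX-T Edge datapath is not documented here.

```python
import zlib
from typing import NamedTuple


class Flow(NamedTuple):
    """The 5-tuple that identifies a flow (hypothetical helper type)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 for TCP


def ecmp_path(flow: Flow, num_paths: int) -> int:
    """Pick a path index in [0, num_paths) from the flow's 5-tuple.

    CRC32 is only a stand-in for whatever hash the real datapath uses.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_paths


flow = Flow("10.0.0.5", "192.0.2.10", 49152, 443, 6)
# The same flow always maps to the same path; different flows may or may not differ.
print(ecmp_path(flow, 2))
```

Because the choice depends only on the 5-tuple, all packets of one flow stay on one path, and nothing forces different flows onto different paths.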

Resolution

None

Workaround:
1. Place the Tier-1 on the same edge cluster as the Tier-0 that it connects to. This way traffic can be marked differently and balanced across all uplinks.
2. Ensure that the number of edge nodes in the Tier-0 cluster (used by the Tier-1's ECMP) is not the same as the number of Tier-0 uplinks (used by the Tier-0's ECMP). For example, create an additional uplink for the Tier-0 so that traffic is distributed across more of the Tier-0 uplinks.
 
The ECMP issue can be seen in configurations such as the following:

1. The Tier-0 and Tier-1 are on different edge clusters.
2. The Tier-0 edge cluster has two edge nodes, and each edge node has two uplinks. ECMP to the northbound external router is via uplink 1 and uplink 2 on edge node 1, and via uplink 3 and uplink 4 on edge node 2, which is 2-way ECMP per edge node. At the same time, the Tier-1 also has 2-way ECMP to both edge nodes.
3. The Tier-1 is connected to a Tier-0 edge cluster that has only two edge nodes, so ECMP from the Tier-1 to the Tier-0 is via Tier-0 Edge A and Tier-0 Edge B, which is 2-way ECMP.
 
To correct the above configuration, make either of the following topology changes:
  • Add one more edge node to the Tier-0 SR cluster, so that ECMP from the Tier-1 becomes 3-way.
  • Add one more uplink to the Tier-0, so that the Tier-0's ECMP becomes 3-way.
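
The effect of these changes can be sketched with a small simulation. It rests on one explicit assumption that this article does not confirm: that the Tier-1 and the Tier-0 edge node apply the same hash to the same 5-tuple. The `flow_hash` and `used_uplinks` helpers, the CRC32 hash, and the sample addresses are all hypothetical.

```python
import zlib


def flow_hash(src_ip, dst_ip, src_port, dst_port, protocol):
    """Hypothetical stand-in for the datapath's 5-tuple hash."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    return zlib.crc32(key)


def used_uplinks(flows, num_edges, uplinks_per_edge):
    """Return the set of (edge node, uplink) pairs that carry at least one flow."""
    used = set()
    for flow in flows:
        h = flow_hash(*flow)
        edge = h % num_edges           # Tier-1 picks a Tier-0 edge node
        uplink = h % uplinks_per_edge  # ASSUMPTION: the edge node reuses the same hash
        used.add((edge, uplink))
    return used


# 1000 sample TCP flows that differ only in source port.
flows = [("10.0.0.5", "192.0.2.10", 1024 + i, 443, 6) for i in range(1000)]

for num_edges, uplinks_per_edge in [(2, 2), (3, 2), (2, 3)]:
    used = used_uplinks(flows, num_edges, uplinks_per_edge)
    total = num_edges * uplinks_per_edge
    print(f"{num_edges} edge nodes x {uplinks_per_edge} uplinks per node: "
          f"{len(used)} of {total} uplinks carry traffic")
```

Under this assumption, the 2 x 2 case can never use more than one uplink per edge node, because every flow that lands on a given edge node has the same hash value modulo 2. Making either layer 3-way breaks that alignment, which is consistent with the two topology changes listed above.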


Additional Information

Impact/Risks:
The issue occurs in multi-Tier-0 topology only, that is, when two layers of edge nodes are deployed with the T1 running on a separate edge cluster from the T0 edge cluster. In that case, northbound ECMP is not guaranteed to use all uplinks on the T0 Edge.