Network connectivity issues when Edge nodes are connected to VxLAN network with MP-BGP EVPN

Article ID: 381084

Updated On:

Products

VMware NSX

Issue/Introduction

  • NSX Edges are configured as EVPN peers with Cisco N9k switches acting as border gateways.
  • The Edges are in a multi-homed configuration (each Edge is peered with each border gateway).
  • The NSX Edges also use BGP for routing.
  • The BGP sessions are established and show no errors.
  • South-bound traffic is dropped on the Tier-0 gateway's Edge Transport Node, interrupting network connectivity.
  • The drop counter (TX Drops) on the Tier-0 gateway's uplink interface is increasing:
    <edge-hostname>(tier0_sr[X])> get interfaces stats
    Thu Oct 10 2024 UTC 10:04:27.501
    Logical Router
    UUID        : ########-####-####-####-########8d5d
    VRF         : 2
    LR-ID       : 2
    Name        : SR-<Tier-0 name>
    Type        : SERVICE_ROUTER_TIER0
    Statistics
                  Interface                  Type     RX PKTS   TX PKTS  RX BYTES  TX BYTES  RX Drops  TX Drops
    ...
    ########-####-####-####-########4ae6      lif     1981668   2357444  176450467 148213009      43    412108
     
  • The NSX Edge is unable to determine a path for the affected traffic. This can be confirmed with the "get path" command, which shows only one interface (the Tier-0 SR uplink).
  • The source IP address (PIP or VIP) is not learned by the BGP/VTEP interface on the Edge, so the traffic is treated as untrusted and dropped.
  • The Tier-0 Edge does not learn the vPC BGW (virtual Port Channel Border Gateway) on its TEP interfaces. This can be confirmed by running the following command on the Edge Transport Node:
    <edge-hostname>> get tunnel-port 
  • When a south-bound packet is encapsulated as VxLAN, the fabric may use either the vPC VIP or a PIP as the source address. If this IP address is not learned by BGP, the traffic is dropped on the Edge.
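
To confirm that the TX Drops counter is actually increasing, two captures of the "get interfaces stats" output taken some time apart can be compared. The following is a minimal sketch, assuming the output has been saved as text and that data rows follow the eight-column layout shown above. The function names `parse_interface_stats` and `tx_drop_deltas` are hypothetical helpers and are not part of NSX.

```python
from typing import Dict, NamedTuple


class LifStats(NamedTuple):
    """Counters from one data row of 'get interfaces stats'."""
    rx_pkts: int
    tx_pkts: int
    rx_bytes: int
    tx_bytes: int
    rx_drops: int
    tx_drops: int


def parse_interface_stats(text: str) -> Dict[str, LifStats]:
    """Map interface ID to its counters (hypothetical helper).

    Assumes a data row has exactly eight whitespace-separated fields:
    interface ID, type, then six integer counters. Header and metadata
    lines do not match this shape and are skipped.
    """
    stats: Dict[str, LifStats] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or not all(f.isdigit() for f in fields[2:]):
            continue
        stats[fields[0]] = LifStats(*(int(f) for f in fields[2:]))
    return stats


def tx_drop_deltas(before: Dict[str, LifStats],
                   after: Dict[str, LifStats]) -> Dict[str, int]:
    """Return interfaces whose TX Drops counter grew between two captures."""
    return {
        name: after[name].tx_drops - before[name].tx_drops
        for name in after
        if name in before and after[name].tx_drops > before[name].tx_drops
    }
```

An interface that appears in the result of `tx_drop_deltas` is still dropping egress traffic and is a candidate for the checks described in this article.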

Environment

Cause

  • The vPC BGW uses a PIP or the vPC VIP to send traffic to the NSX Tier-0 gateway.
  • Due to a misconfiguration on the fabric, the IP address (PIP or VIP) used to send traffic to the Tier-0 gateway is not learned by BGP, so the NSX Edge considers it untrusted.
  • The Edge drops traffic that arrives from such an untrusted source IP address.
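
The cause described above can be pictured as a membership check on the outer source IP address of the VxLAN packet. The following is an illustrative model only, not the NSX Edge implementation: `is_trusted_source` is a hypothetical function, and the addresses are placeholders from the documentation range.

```python
import ipaddress
from typing import Iterable


def is_trusted_source(outer_src_ip: str, learned_vtep_ips: Iterable[str]) -> bool:
    """Return True if the outer source IP matches a remote VTEP learned via BGP.

    Illustrative model (assumption): traffic whose outer source IP is not
    among the learned addresses is treated as untrusted and dropped.
    """
    src = ipaddress.ip_address(outer_src_ip)
    return any(src == ipaddress.ip_address(ip) for ip in learned_vtep_ips)


# Placeholder addresses (assumptions, not taken from a real fabric).
learned = {"192.0.2.11", "192.0.2.12"}  # addresses learned through BGP
vpc_vip = "192.0.2.100"                 # vPC VIP that was never learned

for src in ("192.0.2.11", vpc_vip):
    verdict = "forward" if is_trusted_source(src, learned) else "drop"
    print(f"outer source {src}: {verdict}")
```

In this model, a packet sourced from the vPC VIP is dropped until that address is learned on the Edge, which matches the symptom of BGP being established while south-bound traffic is still lost.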

Resolution

The resolution requires configuration changes at the fabric level, which is outside the scope of NSX.

To eliminate packet loss and traffic blackholing, it is recommended to use Edge nodes in a single-homed setup (instead of multi-homed).