Network connectivity issues when Edge nodes are connected to VxLAN network with MP-BGP EVPN

Article ID: 381084

Products

VMware NSX

Issue/Introduction

  • NSX Edges are using BGP for routing.
  • BGP is established and shows no errors.
  • South-bound traffic is dropped on the Tier-0 gateway's Edge Transport Node, interrupting network connectivity.
  • The drop counter (TX Drops) on the Tier-0 gateway's uplink interface keeps increasing:
    <edge-hostname>(tier0_sr[X])> get interfaces stats
    Thu Oct 10 2024 UTC 10:04:27.501
    Logical Router
    UUID        : ########-####-####-####-########8d5d
    VRF         : 2
    LR-ID       : 2
    Name        : SR-<Tier-0 name>
    Type        : SERVICE_ROUTER_TIER0
    Statistics
                  Interface                  Type     RX PKTS   TX PKTS  RX BYTES  TX BYTES  RX Drops  TX Drops
    ...
    ########-####-####-####-########4ae6      lif     1981668   2357444  176450467 148213009      43    412108
  • The NSX Edge cannot determine a path for the affected packet. This can be confirmed with the "get path" command, which shows only one interface (the Tier-0 SR uplink).
  • The source IP address of the affected traffic (PIP or vPC VIP) is not learned through BGP on the Edge's VTEP interface, so the Edge treats the traffic as untrusted and drops it.
  • The Tier-0 Edge does not learn the vPC BGW (virtual Port Channel Border Gateway) on its TEP interfaces. This can be confirmed by running the following command on the Edge Transport Node:
    <edge-hostname>> get tunnel-port 
  • When a south-bound packet is encapsulated as VxLAN, the fabric may use either the vPC VIP or the PIP as the source address. If that address is not learned by BGP, the Edge drops the traffic.
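
To check whether TX Drops is actually rising, two captures of the "get interfaces stats" output taken some time apart can be compared. The following Python is a minimal sketch that assumes each statistics row has the eight-column layout shown in the sample above (interface, type, then six numeric counters). The function names and the second sample row are hypothetical and are not part of NSX.

```python
import re

# One statistics row, assuming the column layout shown in this article:
# <interface> <type> <RX PKTS> <TX PKTS> <RX BYTES> <TX BYTES> <RX Drops> <TX Drops>
ROW = re.compile(
    r"^\s*(\S+)\s+([A-Za-z_]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def parse_tx_drops(output):
    """Return {interface_id: tx_drops} for every statistics row in the text."""
    drops = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            drops[match.group(1)] = int(match.group(8))
    return drops


def rising_tx_drops(before, after):
    """Return {interface_id: increase} for interfaces whose TX Drops grew."""
    first = parse_tx_drops(before)
    second = parse_tx_drops(after)
    return {
        iface: second[iface] - first[iface]
        for iface in second
        if iface in first and second[iface] > first[iface]
    }


if __name__ == "__main__":
    # First row is copied from the sample output; the second is a made-up later capture.
    capture_1 = "########-####-####-####-########4ae6      lif     1981668   2357444  176450467 148213009      43    412108"
    capture_2 = "########-####-####-####-########4ae6      lif     1981700   2357500  176460000 148220000      43    412950"
    for iface, delta in rising_tx_drops(capture_1, capture_2).items():
        print(f"{iface}: TX Drops increased by {delta}")
```

Header lines, timestamps, and the "..." placeholder do not match the row pattern and are skipped, so complete captures can be passed in unchanged.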

Environment

Cause

  • The vPC BGW uses either a PIP or a vPC VIP as the source address when sending traffic to the NSX Tier-0 gateway.
  • Because of a PIP/VIP misconfiguration on the fabric, the IP address used to send traffic to the Tier-0 gateway is not learned by BGP, so the NSX Edge does not treat it as trusted.
  • The Edge drops traffic that arrives from such an untrusted source IP address.
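
The behaviour described above can be pictured as a simple membership check. The sketch below is a conceptual model only, not NSX datapath code: it assumes the Edge holds a set of VTEP addresses learned through BGP and accepts an encapsulated packet only if its outer source IP is in that set. The function name and all addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical.

```python
import ipaddress


def is_trusted_source(outer_src_ip, learned_vteps):
    """Return True if the outer source IP matches a BGP-learned VTEP address."""
    source = ipaddress.ip_address(outer_src_ip)
    return source in {ipaddress.ip_address(vtep) for vtep in learned_vteps}


if __name__ == "__main__":
    # Hypothetical: only the PIPs of the two border gateways were learned.
    learned = {"192.0.2.10", "192.0.2.11"}
    # Hypothetical vPC VIP that the fabric uses as the source but never advertised.
    vpc_vip = "192.0.2.20"

    for src in ("192.0.2.10", vpc_vip):
        verdict = "forward" if is_trusted_source(src, learned) else "drop"
        print(f"outer source {src}: {verdict}")
```

In this model, the fabric-side fix amounts to making sure every address the BGW can use as a source (PIP and vPC VIP) ends up in the learned set.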

Resolution

The resolution requires configuration changes at the fabric level, which is outside the scope of NSX.