BFD and BGP Flaps on NSX Edge due to Underlay Packet Loss

Article ID: 435173

Updated On:

Products

VMware NSX

Issue/Introduction

  • BFD sessions between NSX Edge and physical routers (e.g., Juniper, Cisco) flap intermittently.
  • BGP sessions established over these BFD sessions also flap, showing reasons such as BFD down received or Hold Timer Expired.
  • Edge syslog or FRR logs (/var/log/frr/frr.log) report Control Detection Time Expired.
  • Physical router logs indicate a failure to receive keepalives from the peer (e.g., Local reason: NbrSignal, Remote reason: CtlExpire).

Environment

  • VMware NSX

Cause

This issue typically occurs when Bidirectional Forwarding Detection (BFD) control packets are dropped or significantly delayed in the physical underlay network. When the NSX Edge receives no BFD control packet from the peer within the configured detection time (the transmission interval multiplied by the multiplier), it declares the session down to protect the data path, which in turn triggers a BGP reset.
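As a rough illustration of the detection window, the sketch below applies the asynchronous-mode rule from RFC 5880: the detection time is the peer's detect multiplier times the larger of the local required minimum receive interval and the peer's desired minimum transmit interval. The function name and the sample values are illustrative assumptions, not NSX defaults or an NSX API.

```python
def detection_time_ms(remote_detect_mult: int,
                      local_required_min_rx_ms: int,
                      remote_desired_min_tx_ms: int) -> int:
    """Return the BFD detection time in milliseconds (RFC 5880, asynchronous mode).

    All three inputs are hypothetical parameter names chosen for this sketch.
    """
    # The slower of the two advertised intervals is the rate actually used.
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    # The session is declared down if no control packet arrives in this window.
    return remote_detect_mult * agreed_interval_ms


if __name__ == "__main__":
    # Example values only: 100 ms intervals on both sides, multiplier 3.
    print(detection_time_ms(3, 100, 100))
    # The peer transmits more slowly than the local side is willing to receive.
    print(detection_time_ms(3, 100, 500))
```

With aggressive settings such as the first example, any underlay disruption longer than the resulting window is enough to bring the session down.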

Analysis often reveals high packet loss (e.g., 90-100%) during the flap period, while Edge node statistics show zero rx_drop or tx_error counters, indicating the issue resides in the physical transport rather than the NSX Datapath.
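The reasoning above can be sketched as a simple triage heuristic. This is a minimal sketch under stated assumptions: the function names, the 5% loss threshold, and the idea of passing counter deltas (the change in rx_drop and tx_error over the flap window) are all hypothetical choices for illustration, and the values must be collected manually from the ping results and the Edge interface statistics.

```python
def loss_percent(sent: int, received: int) -> float:
    """Percentage of ping probes that were lost."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def likely_fault_domain(sent: int, received: int,
                        rx_drop_delta: int, tx_error_delta: int,
                        loss_threshold_pct: float = 5.0) -> str:
    """Rough guess at where packets are being lost (heuristic, not definitive)."""
    loss = loss_percent(sent, received)
    edge_counters_moved = rx_drop_delta > 0 or tx_error_delta > 0
    # Loss on the wire with clean Edge counters points away from the datapath.
    if loss >= loss_threshold_pct and not edge_counters_moved:
        return "physical underlay"
    # Incrementing Edge counters suggest the Edge itself is dropping traffic.
    if edge_counters_moved:
        return "edge datapath"
    return "inconclusive"


if __name__ == "__main__":
    # Hypothetical sample: 47 of 50 probes lost, Edge counters unchanged.
    print(likely_fault_domain(sent=50, received=3, rx_drop_delta=0, tx_error_delta=0))
```

A result of "physical underlay" is only a starting point for the checks with the network team; it does not replace inspecting the switch ports.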

Resolution

  1. Validate Underlay Connectivity
    • Perform a ping test between the Edge Uplink IP and the physical router peer IP during the event.
    • Use a large packet size to simulate real traffic and check for loss.
    • From the Edge CLI, run the following commands:
      • get bfd-sessions
      • ping <Peer_IP> count 50 size 1472
    • If significant packet loss is observed but the Edge shows no internal interface drops, the issue is in the physical network.
  2. Check for Resource Contention
    • Check whether the Edge VM is experiencing CPU contention or whether vMotion events coincide with the flaps. A vMotion can cause a brief disruption that exceeds aggressive BFD timers (e.g., < 300 ms).
  3. Analyze BFD Statistics
    • Check the BFD session statistics on the Edge to see if the Rx_packets or Tx_packets are incrementing as expected.
      • get bfd-session local-ip <Local_IP> remote-ip <Remote_IP>
    • Look for the last_local_down_diag field to confirm it is indeed Control Detection Time Expired.
  4. Review Physical Switch Logs
    • Coordinate with the network team to inspect the physical switch ports connected to the ESXi hosts. Look for:
      • Interface CRC errors or discards.
      • MTU mismatches (ensure a minimum of 1600 MTU for Geneve/TEP traffic if applicable).
      • High CPU or memory utilization on the physical switches during the flap.
  5. Adjust BFD Timers (Optional Mitigation)
    • If the underlay experiences occasional minor jitter, consider increasing the BFD multiplier or transmission interval to provide a larger detection window.
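To make the trade-off concrete, the following sketch estimates the smallest multiplier that tolerates a measured worst-case gap between received BFD packets. It assumes the detection window is simply the multiplier times a single agreed interval and ignores transmit jitter; the function name and the sample figures are hypothetical.

```python
def min_multiplier(interval_ms: int, worst_gap_ms: int) -> int:
    """Smallest multiplier whose window (multiplier * interval) exceeds the gap."""
    if interval_ms <= 0 or worst_gap_ms < 0:
        raise ValueError("interval must be positive and gap non-negative")
    # Integer division plus one gives a window strictly larger than the gap.
    return worst_gap_ms // interval_ms + 1


if __name__ == "__main__":
    interval_ms = 100    # example transmission interval
    worst_gap_ms = 350   # example worst observed gap between BFD packets
    mult = min_multiplier(interval_ms, worst_gap_ms)
    # The same window is also the time needed to detect a genuine failure.
    print(f"multiplier={mult}, detection window={mult * interval_ms} ms")
```

The detection window printed here is also the time it takes to detect a real outage, so choose the smallest value that rides through the observed jitter.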

Note: Increasing timers will result in slower failover times in a genuine failure sc