Troubleshooting BGP on NSX-T Edge Nodes

Article ID: 339392


Products

VMware NSX

Issue/Introduction

When troubleshooting BGP sessions, there are a few things to check and consider. This article examines the different areas to verify, validate and troubleshoot in a BGP session.

Environment

VMware NSX-T
VMware NSX

Cause

A BGP session can fail to establish for several reasons. The most common are:

  • No communication between peers.
  • Timer mismatch between peers.
  • BFD configuration mismatch between peers.

Resolution

When starting a BGP troubleshooting session, check and consider the following first:

  • Identify which interfaces are involved in the peering and which BGP state the peers are in.

    • Which BGP state (Idle, Connect, Active, OpenSent, OpenConfirm, Established) are the peers in, or cycling between?
    • Check in the UI: is the peering between the T0 SR and a physical router?
    • Has the peering ever been stable in the Established state?
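As a rough aid for interpreting the state a peer is stuck in, the sketch below maps each state from the list above to the area it usually points at. The state names are the standard BGP finite-state-machine names; the hint wording and the helper names `triage_hint` and `ever_established` are illustrative assumptions, not NSX or FRR output.

```python
# BGP finite-state-machine states as listed above (names per RFC 4271).
# The hint text is illustrative wording, not NSX or FRR output.
BGP_STATE_HINTS = {
    "Idle": "no connection attempt in progress; confirm the neighbor is "
            "configured and enabled on both sides",
    "Connect": "waiting for the TCP connection to complete; check "
               "reachability to the peer on TCP port 179",
    "Active": "listening for or retrying the TCP connection; check the peer "
              "address, source interface and filtering of TCP port 179",
    "OpenSent": "TCP is up and an OPEN was sent; compare AS numbers and "
                "other OPEN parameters on both peers",
    "OpenConfirm": "OPEN received, waiting for a KEEPALIVE; check timers and "
                   "look for NOTIFICATION messages in the logs",
    "Established": "session is up; if it does not stay up, look for flaps "
                   "in the logs",
}


def triage_hint(state: str) -> str:
    """Return a one-line hint for a BGP state name (case-insensitive)."""
    wanted = state.strip().lower()
    for name, hint in BGP_STATE_HINTS.items():
        if name.lower() == wanted:
            return f"{name}: {hint}"
    return f"{state.strip()}: unknown state, check the spelling"


def ever_established(history: list[str]) -> bool:
    """True if any observed state in the history was Established."""
    return any(s.strip().lower() == "established" for s in history)


# Example use with states noted down by hand during troubleshooting:
#   print(triage_hint("Active"))
#   print(ever_established(["Idle", "Connect", "Active"]))
```

A peer that cycles between Connect and Active points at the transport path, while one that reaches OpenSent or OpenConfirm and falls back points at the BGP parameters exchanged between the peers.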

  • Places to check in the NSX-T UI
      • Networking > Tier-0 Gateways > click the three-dot (ellipsis) menu > select 'Generate BGP Summary'
        • This shows all of the peering relationships configured on a T0, and their Connection Status (BGP state)
      • Networking > Tier-0 Gateways > expand the BGP section > click the blue number next to BGP Neighbors
        • Expand to show the BFD / Keep Alive / Hold timers
        • Select 'i' next to Status to see general peering information, similar to Generate BGP Summary above

    • Commands used during troubleshooting within the T0 VRF on the Edge
      • nsx-t-edge > get logical-router
      • In the output, find the VRF ID of the tier0_sr
      • nsx-t-edge > vrf <t0_sr_vrf_id>
      • nsx-t-edge(tier0_sr)> get bgp neighbor summary
      • nsx-t-edge(tier0_sr)> get bgp neighbor ipv4
      • nsx-t-edge(tier0_sr)> get bgp neighbor advertised-routes (only when the session is in the Established state)
      • nsx-t-edge(tier0_sr)> ping <bgp_neighbor> (a successful ping indicates that the underlay network path to the peer is healthy)

    • Retrieve Edge and Manager log bundles
      • Edge log files to review
        • var/log/frr/frr.log: grep for the remote peer IP, or for "NOTIFICATION" and "ADJCHANGE" if there are multiple peers and you need to filter for adjacency change activity
        • var/log/syslog: grep for "state=BGP" to view state changes
        • <Edge bundle>/edge/frr_show_ip_bgp_neighbors_json
        • <Edge bundle>/edge/frr_show_ip_bgp_summary_json
        • <Edge bundle>/edge/tier0_sr_get_bgp_neighbor
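For offline review of an extracted bundle, the frr.log filtering described above can be sketched in Python. This is a minimal sketch, not a Broadcom tool: it makes no assumption about the layout of a log line, the function name `filter_bgp_events` is hypothetical, and the address matching is written for IPv4 only. Unlike a plain grep for the address, it does not let 10.0.0.1 match 10.0.0.10.

```python
import re
from typing import Iterable, Iterator, Optional

# Keywords this article names for adjacency change activity in frr.log.
KEYWORDS = ("NOTIFICATION", "ADJCHANGE")


def filter_bgp_events(
    lines: Iterable[str],
    peer_ip: Optional[str] = None,
    keywords: Optional[tuple] = KEYWORDS,
) -> Iterator[str]:
    """Yield log lines that satisfy every filter that was supplied.

    peer_ip:  keep only lines mentioning this exact IPv4 address.
    keywords: keep only lines containing at least one keyword;
              pass None to skip keyword filtering.
    """
    ip_pattern = None
    if peer_ip is not None:
        # Reject matches that are part of a longer address,
        # e.g. 10.0.0.1 inside 10.0.0.10 or 110.0.0.1.
        ip_pattern = re.compile(r"(?<![\d.])" + re.escape(peer_ip) + r"(?!\d)")
    for line in lines:
        if keywords is not None and not any(k in line for k in keywords):
            continue
        if ip_pattern is not None and not ip_pattern.search(line):
            continue
        yield line.rstrip("\n")


# Example use against an extracted Edge bundle (path and peer are placeholders):
#   with open("var/log/frr/frr.log", errors="replace") as handle:
#       for event in filter_bgp_events(handle, peer_ip="192.0.2.1"):
#           print(event)
```

Passing `keywords=None` with a `peer_ip` reproduces the "grep for the remote peer IP" case, while the default keyword filter alone shows adjacency changes for all peers.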

Additional Information

Resources/Documentation Links

  • Cisco: BGP Essential Training
  • VMware NSX-T Admin Guide: Configure BGP
  • BGP session diagnostics for troubleshooting BGP session flaps on NSX-T edge node
  • NSX Reference Design: https://community.broadcom.com/viewdocument/nsx-reference-design-guide-42-v10

 

 

Logs

Set debug logs on BGP

From inside the T0 VRF, enable debug logs:
1. set debug
2. set routing debug bgp all
3. get routing debug bgp

After debugging is complete, disable debug logs:

1. clear routing debug bgp all
2. clear debug
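If these steps are scripted, it helps to guarantee that debug logging is switched off again even when the investigation is interrupted. The sketch below is one way to do that under stated assumptions: `send` is a hypothetical callable that delivers one CLI line to an Edge session already inside the T0 VRF (it is not an NSX API), and the command strings are exactly the ones listed above.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

# Command sequences copied from the steps in this article.
ENABLE_COMMANDS = ("set debug", "set routing debug bgp all", "get routing debug bgp")
DISABLE_COMMANDS = ("clear routing debug bgp all", "clear debug")


@contextmanager
def bgp_debug(send: Callable[[str], None]) -> Iterator[None]:
    """Enable BGP debug logging for the duration of a with-block.

    `send` is a hypothetical transport supplied by the caller; the
    disable commands are sent even if the block raises an exception.
    """
    try:
        for command in ENABLE_COMMANDS:
            send(command)
        yield
    finally:
        for command in DISABLE_COMMANDS:
            send(command)


# Example use with a stand-in transport that only records the commands:
#   sent = []
#   with bgp_debug(sent.append):
#       pass  # reproduce the issue and collect logs here
```

Using a context manager keeps the enable and disable sequences paired in one place, so a failure while collecting logs does not leave debug logging running on the Edge.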

 

If you are contacting Broadcom support about this issue, please provide the following:

  • State of the BGP connection as reported on the peer device
  • Whether you can ping the peer device from the T0 SR
  • How long the session has been reported down, and whether it has ever worked
  • BGP configuration on the peer device
  • State of the physical network

Handling Log Bundles for offline review with Broadcom support