BGP neighborship is down alarm

Article ID: 369185

Updated On:

Products

VMware NSX

Issue/Introduction

Title: Alarm to indicate that the status of the BGP neighborship is down.

Event ID: bgp_down

Alarm Description:

  • Purpose: To notify the admin that a BGP neighborship has gone down.
  • Impact: When a BGP neighborship goes down, the routes learned from that peer are withdrawn, which can disrupt traffic within the network.

Environment

VMware NSX-T Data Center

Resolution

Steps to Resolve
For 3.0.0 and higher
 
Recommended Action: 
  • Check for connectivity-related issues:
    • Perform a ping test from the edge node interface or the external peer.
      • To obtain the configured BGP neighbor_address use the API GET /policy/api/v1/infra/tier-0s/<tier-0-id>/locale-services/<locale-service-id>/bgp/neighbors.

        Example Response: BgpNeighborship
        {
          "sort_ascending": true,
          "sort_by": "display_name",
          "result_count": 1,
          "results": [
            {
              "resource_type": "BgpNeighborConfig",
              "id": "peer1",
              "display_name": "peer1",
              "path": "/infra/tier-0s/vmc_prv/locale-services/default/bgp/neighbors/peer1",
              "parent_path": "/infra/tier-0s/vmc_prv/locale-services/default/bgp",
              "relative_path": "peer1",
              "password": "test",
              "remote_as_num": "12",
              "neighbor_address": "##.##.##.##", ---------------> Configured Neighbor Address
              "graceful_restart_mode": "DISABLE",
              "in_route_filters": [ "/infra/tier-0s/vmc/prefix-lists/plist-1" ],
              "_create_user": "admin",
              "_create_time": 1516751318020,
              "_last_modified_user": "admin",
              "_last_modified_time": 1516751318020,
              "_system_owned": false,
              "_protection": "NOT_PROTECTED",
              "_revision": 0
            }
          ]
        }
      • Invoke the NSX CLI command get logical-routers.

        Edge1> get logical-routers
        Logical Router
        UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
        ########-####-####-####-############   0      0                                        TUNNEL                      4       10/5000
        ########-####-####-####-############   1      3      SR-tier0                          SERVICE_ROUTER_TIER0        6       0/50000
        ########-####-####-####-############   3      1      DR-tier0                          DISTRIBUTED_ROUTER_TIER0    6       2/50000
      • Switch to the service router context using the NSX CLI command vrf <vrf_id_of_service_router>. In the sample output above, the SR-tier0 service router has VRF ID 1, so the command is vrf 1.

      • Invoke the NSX CLI command get route and ensure a valid route exists in the routing table for the peer.
      • Invoke the NSX CLI command ping {neighbor_address}. 
      • If the ping fails:
        • Check the VLAN on the segment/Edge logical uplink and the VLAN on the external peer interface. If the VLAN configuration does not match, ping is expected to fail. 
        • Identify the correct VLAN to be configured and ensure it is configured on the edge segment/logical uplink and the interface on the external peer connecting to the edge.
        • To check the VLAN configured on the uplink interface of the edge, use the API GET /policy/api/v1/infra/segments/{segment-id}.

          Sample output: GET /policy/api/v1/infra/segments/{segment-id}
          {
              "type": "DISCONNECTED",
              "vlan_ids": [
                  "5"
              ],
              "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/########-####-####-####-############",
              "advanced_config": {
                  "hybrid": false,
                  "multicast": true,
                  "inter_router": false,
                  "local_egress": false,
                  "urpf_mode": "STRICT",
                  "connectivity": "ON"
              },
              "admin_state": "UP",
              "replication_mode": "MTEP",
              "resource_type": "Segment",
              "id": "tier0-vrfA-uplink-##.##.##.##",
              "display_name": "tier0-vrfA-uplink-##.##.##.##",
              "path": "/infra/segments/tier0-vrfA-uplink-##.##.##.##",
              "relative_path": "tier0-vrfA-uplink-##.##.##.##",
              "parent_path": "/infra",
              "unique_id": "########-####-####-####-############",
              "realization_id": "########-####-####-####-############",
              "owner_id": "########-####-####-####-############",
              "marked_for_delete": false,
              "overridden": false,
              "_system_owned": false,
              "_create_time": 1712865979425,
              "_create_user": "admin",
              "_last_modified_time": 1712865979425,
              "_last_modified_user": "admin",
              "_protection": "NOT_PROTECTED",
              "_revision": 0
          }
      • If the ping is successful and the BGP session remains down: 
        • Check for any firewall rules configured to block the BGP control packets.
        • Check the BGP config on the peer.
  • Check for MTU-related issues:
    • Check the MTU setting on the TOR interface connected to the physical NIC of the DVS uplink that provides connectivity to the Tier-0 uplink.
    • Refer to Guidance to Set Maximum Transmission Unit.
    • Follow procedures in the KB article #97120 to address common NSX underlying infrastructure connectivity issues.
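
The connectivity checks above rely on two values from the API responses: the neighbor_address of each BGP neighbor and the vlan_ids of the uplink segment. The following is a minimal Python sketch of reading both from responses that have already been fetched. The function names, the abbreviated payloads, and the 192.0.2.1 address are illustrative assumptions, not part of NSX. The VLAN check assumes single VLAN IDs such as "5" and does not handle ranges.

```python
import json


def neighbor_addresses(neighbors_response: dict) -> dict:
    """Map each BGP neighbor id to its configured neighbor_address."""
    return {
        neighbor["id"]: neighbor["neighbor_address"]
        for neighbor in neighbors_response.get("results", [])
        if "neighbor_address" in neighbor
    }


def vlan_matches(segment: dict, peer_vlan: int) -> bool:
    """Return True if the peer's VLAN appears in the segment's vlan_ids.

    The sample output lists vlan_ids as strings, so the peer VLAN is
    compared as a string. Ranges are not handled (assumption of this sketch).
    """
    return str(peer_vlan) in segment.get("vlan_ids", [])


# Abbreviated, made-up payloads shaped like the sample responses above.
neighbors_json = """
{
  "result_count": 1,
  "results": [
    {"id": "peer1", "neighbor_address": "192.0.2.1", "remote_as_num": "12"}
  ]
}
"""
segment_json = '{"vlan_ids": ["5"], "admin_state": "UP"}'

addresses = neighbor_addresses(json.loads(neighbors_json))
segment = json.loads(segment_json)

print(addresses)                  # {'peer1': '192.0.2.1'}
print(vlan_matches(segment, 5))   # True
print(vlan_matches(segment, 10))  # False
```

The addresses returned here are the ones to use with the ping command on the service router, and a False result from the VLAN check points to the mismatch described under "If the ping fails".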

Check for configuration-related issues:

  • For the configured BGP neighbor, verify that the neighbor address, AS number, remote AS number, keepalive timer, hold timer, and password (if configured) are set correctly on both the edge node and the external peer.
  • Ensure the neighbor admin state is enabled.
  • To verify the neighbor configuration, use the API GET /policy/api/v1/infra/tier-0s/<tier-0-id>/locale-services/<locale-service-id>/bgp/neighbors.

    Sample Response: GET /policy/api/v1/infra/tier-0s/<tier-0-id>/locale-services/<locale-service-id>/bgp/neighbors
    {
        "source_addresses": [
            "##.##.##.##",
            "##.##.##.##"
        ],
        "neighbor_address": "##.##.##.##", ---------------> Configured Neighbor Address
        "remote_as_num": "420", --------------------------> Remote AS
        "route_filtering": [
            {
                "enabled": true,
                "address_family": "IPV4"
            }
        ],
        "keep_alive_time": 1, ----------------------------> Keep-Alive Timer
        "hold_down_time": 3, -----------------------------> Hold Down Interval
        "bfd": {
            "enabled": true,
            "interval": 500,
            "multiple": 3
        },
        "allow_as_in": false,
        "maximum_hop_limit": 1,
        "password_set": false, ---------------------------> Password - Not Set in Example
        "enabled": true, ---------------------------------> Configured Admin State
        "resource_type": "BgpNeighborConfig",
        "id": "##.##.##.##",
        "display_name": "##.##.##.##",
        "path": "/infra/tier-0s/tier0VrfA/locale-services/VRFA_tier0localeservices/bgp/neighbors/##.##.##.##",
        "relative_path": "##.##.##.##",
        "parent_path": "/infra/tier-0s/tier0VrfA/locale-services/VRFA_tier0localeservices/bgp",
        "unique_id": "########-####-####-####-############",
        "realization_id": "########-####-####-####-############",
        "owner_id": "########-####-####-####-############",
        "marked_for_delete": false,
        "overridden": false,
        "_system_owned": false,
        "_create_time": 1712868927242,
        "_create_user": "admin",
        "_last_modified_time": 1712943952559,
        "_last_modified_user": "admin",
        "_protection": "NOT_PROTECTED",
        "_revision": 1
    }
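
    The configuration checks above can be sketched as a field-by-field comparison between the neighbor object returned by the API and the values recorded from the external peer. This is a minimal Python sketch. The function name find_mismatches, the expected dictionary, and the sample values (including 192.0.2.1 and AS 421) are hypothetical and only illustrate the comparison. The field names are taken from the sample response.

```python
# Fields from the sample response that should agree with the peer's settings.
FIELDS_TO_COMPARE = (
    "neighbor_address",
    "remote_as_num",
    "keep_alive_time",
    "hold_down_time",
    "password_set",
)


def find_mismatches(edge_neighbor: dict, expected: dict) -> list:
    """List differences between the edge-side neighbor config and the
    values recorded from the external peer (hypothetical input)."""
    problems = []
    # The neighbor admin state must be enabled for the session to form.
    if not edge_neighbor.get("enabled", False):
        problems.append("neighbor admin state is disabled")
    for field in FIELDS_TO_COMPARE:
        # Only compare fields for which a peer-side value was supplied.
        if field in expected and edge_neighbor.get(field) != expected[field]:
            problems.append(
                f"{field}: edge has {edge_neighbor.get(field)!r}, "
                f"peer expects {expected[field]!r}"
            )
    return problems


# Made-up edge-side values shaped like the sample response.
edge_neighbor = {
    "neighbor_address": "192.0.2.1",
    "remote_as_num": "420",
    "keep_alive_time": 1,
    "hold_down_time": 3,
    "password_set": False,
    "enabled": True,
}
# Made-up values as recorded from the external peer's configuration.
expected = {
    "neighbor_address": "192.0.2.1",
    "remote_as_num": "421",
    "keep_alive_time": 1,
    "hold_down_time": 3,
}

problems = find_mismatches(edge_neighbor, expected)
for problem in problems:
    print(problem)  # remote_as_num: edge has '420', peer expects '421'
```

    An empty list means the compared fields agree. The password itself is not returned by this sample response (only password_set), so the password value still has to be verified manually on both sides.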

    Is a maintenance window required for remediation? No