Static route is removed because the BFD session configured for that static route is down alarm


Article ID: 369184


Products

VMware NSX

Issue/Introduction

Title: Alarm to indicate that a static route is removed because the BFD session configured for that static route is down.
Event ID: static_routing_removed

Alarm description:

  • Purpose: To notify the user that a static route has been removed because the BFD session configured for that static route went down. This alarm applies only to static routes backed by BFD.
  • Impact: Removal of static routes could lead to traffic disruptions.

Environment

VMware NSX-T Data Center
VMware NSX

Resolution

Recommended Action:

  • Check for configuration-related issues:
    • Check next-hop of the static route and find the associated BFD peer configuration.
    • Verify if the source and destination addresses for the BFD session are correct.
    • Invoke the NSX CLI command: get logical-routers

      Sample CLI output: get logical-routers
      Edge1> get logical-routers
      Logical Router
      UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
      ########-####-####-####-############   0      0                                        TUNNEL                      4       10/5000
      ########-####-####-####-############   1      3      SR-tier0                          SERVICE_ROUTER_TIER0        6       0/50000
      ########-####-####-####-############   3      1      DR-tier0                          DISTRIBUTED_ROUTER_TIER0    6       2/50000
    • Switch to the VRF context of the Service Router using the NSX CLI command vrf vrf_id_of_service_router (the value in the VRF column of the get logical-routers output).
    • Invoke the NSX CLI command: get bfd-sessions
    • Verify the Local_address, Remote_address, and Dest_port (destination port) values.
    • Check whether the local discriminator (Local_discr) value is set for the transmitted packet and the remote discriminator (Remote_discr) value is updated in the output.
      • If the remote discriminator value is not populated, the issue could be due to BFD packets being unable to reach the NSX Edge node. Follow the Check for connectivity-related issues section below.
      • If the remote discriminator value is present and the session remains in the Down state, check the diag code.

        Diag codes, with description and action:
        • Control Detection Time Expired
          • Description: The BFD timer expired, and the end reporting the expiration declares the session down.
          • Action: Check if the BFD timer is too aggressive for system load and path traffic load. Aggressive timers may cause BFD flaps in busy systems. The default BFD timer is 1 sec, and the detection time multiplier is 3.
        • Neighbor Signaled Session Down
          • Description: The peer voluntarily brings down the session when local BFD is up.
          • Action: Check the peer BFD configuration.
        • Administratively Down
          • Description: The BFD session on the NSX Edge node is not enabled.
          • Action: Enable the BFD session on the NSX Edge node.

        Sample CLI output: get bfd-sessions
        <Edge-Name>(tier0_vrf_sr[7])> get bfd-sessions
        BFD Session
        Dest_port                     : 3784 -----------------------------------> Destination Port
        Diag                          : No Diagnostic
        Encap                         : vlan
        Forwarding                    : last true (current true)
        Interface                     : ########-####-####-####-############
        Intf_type                     : LR_PORT
        Keep-down                     : false
        Last_admin_down_diag_time     : 2024-04-17 13:15:18
        Last_cp_diag                  : No Diagnostic
        Last_cp_rmt_diag              : No Diagnostic
        Last_cp_rmt_state             : up
        Last_cp_state                 : up
        Last_down_time                : 2024-04-17 13:15:18
        Last_fwd_state                : UP
        Last_local_down_diag          : Neighbor Signaled Session Down ---------> Edge Diag Code
        Last_remote_admin_down_time   : 2024-04-17 13:15:18
        Last_remote_down_diag         : Administratively Down
        Last_up_time                  : 2024-04-17 13:15:19
        Local_address                 : ##.##.##.## -----------------------------> Local Address
        Local_discr                   : 673456400 ------------------------------> Local Discriminator
        Min_rx_ttl                    : 255
        Multiplier                    : 3
        Received_remote_diag          : No Diagnostic
        Received_remote_state         : up
        Remote_address                : ##.##.##.## ----------------------------> Remote Address
        Remote_admin_down             : false
        Remote_diag                   : No Diagnostic
        Remote_discr                  : 4097 -----------------------------------> Remote Discriminator
        Remote_min_rx_interval        : 1000
        Remote_min_tx_interval        : 1000
        Remote_multiplier             : 3
        Remote_state                  : up
        Router                        : ########-####-####-####-############
        Router_down                   : false
        Rx_cfg_min                    : 500
        Rx_interval                   : 1000
        Service-link                  : false
        Session_type                  : UPLINK
        State                         : up -------------------------------------> State
        Tx_cfg_min                    : 500 ------------------------------------> Configured Transmit Min Interval
        Tx_interval                   : 1000 -----------------------------------> Transmit Interval
        Type                          : IPv4
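
The configuration checks above can be sketched as a small script. This is an illustrative Python sketch, not an NSX tool: it assumes the text of a single session from get bfd-sessions has already been captured into a string. The function names (parse_bfd_session, triage, detection_time_ms) are hypothetical. It also assumes that an unpopulated Remote_discr appears as empty or 0, that the current diag code is reported in the Diag field (falling back to Last_local_down_diag), and that the interval fields are in milliseconds. The detection time uses the asynchronous-mode formula from RFC 5880, and mapping Rx_cfg_min and Remote_min_tx_interval onto that formula is an assumption.

```python
import re

# Actions copied from the diag code list in this article.
DIAG_ACTIONS = {
    "Control Detection Time Expired":
        "Check whether the BFD timer is too aggressive for system and path load.",
    "Neighbor Signaled Session Down":
        "Check the peer BFD configuration.",
    "Administratively Down":
        "Enable the BFD session on the NSX Edge node.",
}


def parse_bfd_session(text):
    """Parse the 'Key : value' lines of one BFD session block into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # prompt and header lines carry no colon
        # Drop the "-----> label" annotations used in this article's sample.
        value = re.sub(r"\s*-{3,}>.*$", "", value)
        fields[key.strip()] = value.strip()
    return fields


def triage(fields):
    """Suggest the next step, following the checks described above."""
    if fields.get("State", "").lower() == "up":
        return "Session is up; no action needed."
    # Assumption: an unpopulated remote discriminator shows as empty or 0.
    if fields.get("Remote_discr", "") in ("", "0"):
        return ("Remote discriminator not populated; "
                "follow the connectivity-related checks.")
    # Assumption: Diag holds the current code; fall back to the last local one.
    diag = fields.get("Diag", "")
    if diag in ("", "No Diagnostic"):
        diag = fields.get("Last_local_down_diag", "")
    return DIAG_ACTIONS.get(diag, "No action listed for diag code %r." % diag)


def detection_time_ms(fields):
    """Remote multiplier x max(local min RX, remote min TX), per RFC 5880."""
    agreed_interval = max(int(fields["Rx_cfg_min"]),
                          int(fields["Remote_min_tx_interval"]))
    return int(fields["Remote_multiplier"]) * agreed_interval
```

With the values in the sample output above (Rx_cfg_min 500, Remote_min_tx_interval 1000, Remote_multiplier 3), detection_time_ms returns 3000, which matches the 1 sec timer and multiplier of 3 mentioned in the diag code guidance.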

 

  • Check for connectivity-related issues:
    • Invoke the NSX CLI command: get logical-routers

      Sample CLI output: get logical-routers
      <Edge-Name> get logical-routers
      Logical Router
      UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
      ########-####-####-####-############   0      0                                        TUNNEL                      4       10/5000
      ########-####-####-####-############   1      3      SR-tier0                          SERVICE_ROUTER_TIER0        6       0/50000
      ########-####-####-####-############   3      1      DR-tier0                          DISTRIBUTED_ROUTER_TIER0    6       2/50000
    • Switch to the VRF context of the Service Router using the NSX CLI command vrf vrf_id_of_service_router (the value in the VRF column of the get logical-routers output).
    • Invoke the NSX CLI command get bfd-sessions to obtain the local_address and the remote_address.
    • Invoke the NSX CLI command get route and ensure a valid route exists in the routing table for the peer.
    • Invoke the NSX CLI command ping peer_address
    • If the ping fails:
      • Check the VLAN on the segment/Edge logical uplink and the VLAN on the external peer interface. If the VLAN configuration does not match, ping is expected to fail.
      • Identify the correct VLAN to be configured and ensure it is configured on the segment/logical uplink and the interface on the external peer connecting to the NSX Edge node.
      • To check the VLAN configured on the uplink interface of the NSX Edge node, use the API GET /policy/api/v1/infra/segments/{segment-id}

        Sample output: GET /policy/api/v1/infra/segments/{segment-id}
        {
          "type": "DISCONNECTED",
          "vlan_ids": [
              "5"
          ],
          "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/########-####-####-####-############",
          "advanced_config": {
              "hybrid": false,
              "multicast": true,
              "inter_router": false,
              "local_egress": false,
              "urpf_mode": "STRICT",
              "connectivity": "ON"
          },
          "admin_state": "UP",
          "replication_mode": "MTEP",
          "resource_type": "Segment",
          "id": "<tier0-vrfA-uplink>",
          "display_name": "<tier0-vrfA-uplink>",
          "path": "/infra/segments/<tier0-vrfA-uplink>",
          "relative_path": "<tier0-vrfA-uplink>",
          "parent_path": "/infra",
          "unique_id": "########-####-####-####-############",
          "realization_id": "########-####-####-####-############",
          "owner_id": "########-####-####-####-############",
          "marked_for_delete": false,
          "overridden": false,
          "_system_owned": false,
          "_create_time": 1712865979425,
          "_create_user": "admin",
          "_last_modified_time": 1712865979425,
          "_last_modified_user": "admin",
          "_protection": "NOT_PROTECTED",
          "_revision": 0
        }
    • If the ping is successful and the BFD session remains in the DOWN state:
      • Check for any firewall rules configured to block the BFD control packets.
      • Check the BFD config on the peer.
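
The connectivity checks above can be sketched in Python as follows. This is a hedged illustration only: the segment dictionary is assumed to be the parsed JSON body returned by GET /policy/api/v1/infra/segments/{segment-id}, the peer VLAN is assumed to be known from the external device, and the function names (segment_vlans, vlan_matches, next_step) are hypothetical. The sketch also assumes that each vlan_ids entry is either a single ID such as "5" or a range such as "100-110".

```python
def segment_vlans(segment):
    """Expand the segment's "vlan_ids" entries into a set of integers."""
    vlans = set()
    for entry in segment.get("vlan_ids", []):
        # Assumption: an entry is a single ID ("5") or a range ("100-110").
        low, _, high = str(entry).partition("-")
        vlans.update(range(int(low), int(high or low) + 1))
    return vlans


def vlan_matches(segment, peer_vlan):
    """Return True if the peer interface VLAN is carried by the segment."""
    return peer_vlan in segment_vlans(segment)


def next_step(ping_ok, segment, peer_vlan, bfd_state):
    """Map the ping result, VLAN comparison, and BFD state to the next check."""
    if not ping_ok:
        if not vlan_matches(segment, peer_vlan):
            return ("VLAN mismatch: configure the same VLAN on the "
                    "segment/logical uplink and the external peer interface.")
        return ("VLANs match: verify with get route that a valid route "
                "to the peer exists.")
    if bfd_state.lower() != "up":
        return ("Ping works but BFD is down: check firewall rules for BFD "
                "control packets and the BFD config on the peer.")
    return "Ping and BFD are both healthy."
```

For example, with the sample segment above ("vlan_ids": ["5"]) and a peer interface on VLAN 6, next_step(False, segment, 6, "down") reports a VLAN mismatch, which is consistent with the expectation that ping fails when the VLAN configuration does not match.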


Is a maintenance window required for remediation? No