Static route is removed because the BFD session configured for that static route is down alarm

Article ID: 369184

Updated On:

Products

VMware NSX

Issue/Introduction

Title: Alarm to indicate that a static route is removed because the BFD session configured for that static route is down.

Event ID: static_routing_removed

Alarm description:

  • Purpose: Notifies the user that a static route was removed because the BFD session configured for that static route went down. This alarm applies only to static routes backed by BFD.
  • Impact: Removal of static routes could lead to traffic disruptions.

Environment

VMware NSX-T Data Center

Resolution

Steps to Resolve
For 3.0.0 and higher
 
Recommended Action:
  • Check for configuration-related issues:
    • Check the next hop of the static route and find the associated BFD peer configuration.
    • Verify that the source and destination addresses for the BFD session are correct.
    • Invoke the NSX CLI command get logical-routers.

      Sample CLI output: get logical-routers
      Edge1> get logical-routers
      Logical Router
      UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
      ########-####-####-####-############   0      0                                        TUNNEL                      4       10/5000
      ########-####-####-####-############   1      3      SR-tier0                          SERVICE_ROUTER_TIER0        6       0/50000
      ########-####-####-####-############   3      1      DR-tier0                          DISTRIBUTED_ROUTER_TIER0    6       2/50000
    • Switch to the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router.
    • Invoke the NSX CLI command get bfd-sessions and verify that the local_address, remote_address, and destination_port are correct.

    • Check whether the local discriminator (Local_discr) value is set for the transmitted packet and the remote discriminator (Remote_discr) value is updated in the output.
      • If the remote discriminator value is not populated, the issue could be due to BFD packets being unable to reach the edge. Follow the Check for connectivity-related issues section below.
      • If the remote discriminator value is present and the session remains in the Down state, check the diag code.

        Diag Code                      | Description                                                                            | Action
        Control Detection Time Expired | The BFD timer expired, and the end reporting the expiration declares the session down. | Check if the BFD timer is too aggressive for system load and path traffic load. Aggressive timers may cause BFD flaps in busy systems. The default BFD timer is 1 sec, and the detection time multiplier is 3.
        Neighbor Signaled Session Down | Peer voluntarily brings down session when local BFD is up.                             | Check peer BFD configuration.
        Administratively Down          | BFD session on the edge is not enabled.                                                | Enable BFD session on the edge.

        Sample CLI output: get bfd-sessions
        Edge1(tier0_vrf_sr[7])> get bfd-sessions
        BFD Session
        Dest_port                     : 3784 -----------------------------------> Destination Port
        Diag                          : No Diagnostic
        Encap                         : vlan
        Forwarding                    : last true (current true)
        Interface                     : ########-####-####-####-############
        Intf_type                     : LR_PORT
        Keep-down                     : false
        Last_admin_down_diag_time     : 2024-04-17 13:15:18
        Last_cp_diag                  : No Diagnostic
        Last_cp_rmt_diag              : No Diagnostic
        Last_cp_rmt_state             : up
        Last_cp_state                 : up
        Last_down_time                : 2024-04-17 13:15:18
        Last_fwd_state                : UP
        Last_local_down_diag          : Neighbor Signaled Session Down ---------> Edge Diag Code
        Last_remote_admin_down_time   : 2024-04-17 13:15:18
        Last_remote_down_diag         : Administratively Down
        Last_up_time                  : 2024-04-17 13:15:19
        Local_address                 : ##.##.##.## -----------------------------> Local Address
        Local_discr                   : 673456400 ------------------------------> Local Discriminator
        Min_rx_ttl                    : 255
        Multiplier                    : 3
        Received_remote_diag          : No Diagnostic
        Received_remote_state         : up
        Remote_address                : ##.##.##.## ----------------------------> Remote Address
        Remote_admin_down             : false
        Remote_diag                   : No Diagnostic
        Remote_discr                  : 4097 -----------------------------------> Remote Discriminator
        Remote_min_rx_interval        : 1000
        Remote_min_tx_interval        : 1000
        Remote_multiplier             : 3
        Remote_state                  : up
        Router                        : addcbd64-9713-4205-a746-0adadbe3073f
        Router_down                   : false
        Rx_cfg_min                    : 500
        Rx_interval                   : 1000
        Service-link                  : false
        Session_type                  : UPLINK
        State                         : up -------------------------------------> State
        Tx_cfg_min                    : 500 ------------------------------------> Configured Transmit Min Interval
        Tx_interval                   : 1000 -----------------------------------> Transmit Interval
        Type                          : IPv4
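
The configuration checks above can be sketched as a small script. This is a minimal sketch, assuming the get bfd-sessions output for a single session has been captured as text. The helper names (parse_bfd_session, triage, detection_time_ms) are hypothetical and not part of NSX. It also assumes that an unpopulated remote discriminator shows up as empty or 0 and that the current diag code is reported in the Diag field. The detection time follows the asynchronous-mode rule in RFC 5880: the remote multiplier times the larger of the local minimum receive interval and the remote minimum transmit interval.

```python
import re

# Suggested actions keyed by diag code, taken from the diag code table above.
DIAG_ACTIONS = {
    "Control Detection Time Expired":
        "Check whether the BFD timer is too aggressive for system and path load.",
    "Neighbor Signaled Session Down": "Check peer BFD configuration.",
    "Administratively Down": "Enable BFD session on the edge.",
}


def parse_bfd_session(text):
    """Parse 'Key : value' lines of one BFD session into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        # Drop trailing annotations such as '-----> State' if present.
        line = re.sub(r"\s*-{3,}>.*$", "", line)
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def triage(fields):
    """Return hints that follow the checks in this article (hypothetical helper)."""
    if fields.get("State", "").lower() == "up":
        return ["Session is up; no action needed."]
    # Assumption: an unpopulated remote discriminator is empty or "0".
    if fields.get("Remote_discr", "") in ("", "0"):
        return ["Remote discriminator not populated: check connectivity to the peer."]
    # Assumption: the current diag code is in the 'Diag' field.
    diag = fields.get("Diag", "")
    return [DIAG_ACTIONS.get(diag, "No mapped action for diag code: " + repr(diag))]


def detection_time_ms(remote_multiplier, local_min_rx_ms, remote_min_tx_ms):
    """Detection time in ms for asynchronous mode, per RFC 5880."""
    return remote_multiplier * max(local_min_rx_ms, remote_min_tx_ms)
```

With the default 1 sec timer on both ends and a multiplier of 3, detection_time_ms(3, 1000, 1000) gives 3000, that is, the session is declared down after 3 seconds without a received control packet.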
  • Check for connectivity-related issues:
    • Invoke the NSX CLI command get logical-routers.

      Sample CLI output: get logical-routers
      Edge1> get logical-routers
      Logical Router
      UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
      ########-####-####-####-############   0      0                                        TUNNEL                      4       10/5000
      ########-####-####-####-############   1      3      SR-tier0                          SERVICE_ROUTER_TIER0        6       0/50000
      ########-####-####-####-############   3      1      DR-tier0                          DISTRIBUTED_ROUTER_TIER0    6       2/50000
    • Switch to the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router.

    • Invoke the NSX CLI command get bfd-sessions to obtain the local_address and the remote_address.

    • Invoke the NSX CLI command get route and ensure a valid route exists in the routing table for the peer.
    • Invoke the NSX CLI command ping {peer_address}.
    • If the ping fails:
      • Check the VLAN on the segment/Edge logical uplink and the VLAN on the external peer interface. If the VLAN configuration does not match, ping is expected to fail. 
      • Identify the correct VLAN to be configured and ensure it is configured on the edge segment/logical uplink and the interface on the external peer connecting to the edge.
      • To check the VLAN configured on the uplink interface of the edge, use the API GET /policy/api/v1/infra/segments/{segment-id}.

        Sample output: GET /policy/api/v1/infra/segments/{segment-id}
        {
            "type": "DISCONNECTED",
            "vlan_ids": [
                "5"
            ],
            "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/########-####-####-####-############",
            "advanced_config": {
                "hybrid": false,
                "multicast": true,
                "inter_router": false,
                "local_egress": false,
                "urpf_mode": "STRICT",
                "connectivity": "ON"
            },
            "admin_state": "UP",
            "replication_mode": "MTEP",
            "resource_type": "Segment",
            "id": "<tier0-vrfA-uplink>",
            "display_name": "<tier0-vrfA-uplink>",
            "path": "/infra/segments/<tier0-vrfA-uplink>",
            "relative_path": "<tier0-vrfA-uplink>",
            "parent_path": "/infra",
            "unique_id": "########-####-####-####-############",
            "realization_id": "########-####-####-####-############",
            "owner_id": "########-####-####-####-############",
            "marked_for_delete": false,
            "overridden": false,
            "_system_owned": false,
            "_create_time": 1712865979425,
            "_create_user": "admin",
            "_last_modified_time": 1712865979425,
            "_last_modified_user": "admin",
            "_protection": "NOT_PROTECTED",
            "_revision": 0
        }
    • If the ping is successful and the BFD session remains in the DOWN state: 
      • Check for any firewall rules configured to block the BFD control packets.
      • Check the BFD config on the peer.
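
The VLAN comparison in the ping-failure steps above can be sketched as follows. This is a minimal sketch, assuming the JSON body returned by GET /policy/api/v1/infra/segments/{segment-id} has already been saved as text and that the VLAN configured on the external peer interface is known. The function names are hypothetical, and the handling of range entries such as "5-10" in vlan_ids is an assumption about what the API may return.

```python
import json


def expand_vlan_ids(vlan_ids):
    """Expand entries such as "5" or "5-10" into a set of integer VLAN IDs."""
    vlans = set()
    for entry in vlan_ids:
        start, sep, end = str(entry).partition("-")
        if sep:
            # Assumption: a range is written as "<first>-<last>", inclusive.
            vlans.update(range(int(start), int(end) + 1))
        else:
            vlans.add(int(start))
    return vlans


def vlan_matches(segment_json, peer_vlan):
    """Return True if the peer VLAN is among the segment's configured VLANs."""
    segment = json.loads(segment_json)
    return int(peer_vlan) in expand_vlan_ids(segment.get("vlan_ids", []))
```

For a segment whose vlan_ids is ["5"], as in the sample output, vlan_matches returns True for a peer VLAN of 5 and False for any other VLAN, in which case the ping is expected to fail until the VLANs are aligned.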

Is a maintenance window required for remediation? No