Title: Alarm to indicate that the status of the BFD session on the external interface is down.
Event ID: bfd_down_on_external_interface
Alarm description:
VMware NSX-T Data Center
VMware NSX
Steps to resolve:
For 3.0.0 and higher
Recommended Action:
Edge1> get logical-routers
Logical Router
UUID                                   VRF   LR-ID   Name       Type                       Ports   Neighbors
########-####-####-####-###########    0     0                  TUNNEL                     4       10/5000
########-####-####-####-###########    1     3       SR-tier0   SERVICE_ROUTER_TIER0       6       0/50000
########-####-####-####-###########    3     1       DR-tier0   DISTRIBUTED_ROUTER_TIER0   6       2/50000
Switch to the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router.
Check the BFD configuration of the BGP neighbors:
GET /policy/api/v1/global-infra/tier-0s/{tier-0-id}/locale-services/{locale-service-id}/bgp/neighbors
{
  "source_addresses": [
    "##.###.##.##.###",
    "##.###.##.##.###"
  ],
  "neighbor_address": "##.###.##.##.###",
  "remote_as_num": "420",
  "route_filtering": [
    {
      "enabled": true,
      "address_family": "IPV4"
    }
  ],
  "keep_alive_time": 1,
  "hold_down_time": 3,
  "bfd": {
    "enabled": true,     ---------> Admin State
    "interval": 500,     ---------> Keepalive Interval
    "multiple": 3        ---------> Multiplier
  },
  "allow_as_in": false,
  "maximum_hop_limit": 1,
  "password_set": false,
  "enabled": true,
  "resource_type": "BgpNeighborConfig",
  "id": "##.###.##.##.###",
  "display_name": "##.###.##.##.###",
  "path": "/infra/tier-0s/tier0VrfA/locale-services/VRFA_tier0localeservices/bgp/neighbors/##.###.##.##.###",
  "relative_path": "##.###.##.##.###",
  "parent_path": "/infra/tier-0s/tier0VrfA/locale-services/VRFA_tier0localeservices/bgp",
  "unique_id": "########-####-####-####-###########",
  "realization_id": "########-####-####-####-###########",
  "owner_id": "########-####-####-####-###########",
  "marked_for_delete": false,
  "overridden": false,
  "_system_owned": false,
  "_create_time": 1712868927242,
  "_create_user": "admin",
  "_last_modified_time": 1712943952559,
  "_last_modified_user": "admin",
  "_protection": "NOT_PROTECTED",
  "_revision": 1
}
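The BFD fields in the neighbor object above can be pulled out programmatically. The following is a minimal Python sketch, not an official NSX tool: the function name summarize_bfd, the sample address 192.0.2.1 and the "configured_detection_ms" value are illustrative assumptions. The product interval x multiple is only the locally configured figure; the detection time actually in effect depends on the values negotiated with the peer.

```python
import json


def summarize_bfd(neighbor: dict) -> dict:
    """Extract the BFD settings from one BgpNeighborConfig object."""
    bfd = neighbor.get("bfd", {})
    interval_ms = bfd.get("interval")
    multiple = bfd.get("multiple")
    detection_ms = None
    if interval_ms is not None and multiple is not None:
        # Locally configured value only; the negotiated value may differ.
        detection_ms = interval_ms * multiple
    return {
        "neighbor_address": neighbor.get("neighbor_address"),
        "bfd_enabled": bool(bfd.get("enabled", False)),
        "interval_ms": interval_ms,
        "multiple": multiple,
        "configured_detection_ms": detection_ms,
    }


# Hypothetical, trimmed-down neighbor object modeled on the sample output.
sample = json.loads(
    '{"neighbor_address": "192.0.2.1",'
    ' "bfd": {"enabled": true, "interval": 500, "multiple": 3}}'
)
summary = summarize_bfd(sample)
print(summary)
```

If "bfd_enabled" is false in the result, the Administratively Down diag code is the likely outcome on the edge side.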
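In the get bfd-sessions output, check the local_address, remote_address and destination_port values.
Confirm that the local discriminator (Local_discr) value is set for the transmitted packet and that the remote discriminator (Remote_discr) value is updated in the output.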
If the remote discriminator (Remote_discr) value is not populated, the issue could be due to BFD packets being unable to reach the edge. Follow the Check for connectivity-related issues section below.
If the remote discriminator value is present and the session remains in the Down state, check the diag code.

| Diag Code | Description | Action |
|---|---|---|
| Control Detection Time Expired | The BFD rx_interval timer expired, and the end reporting the expiration declares the session down. | |
| Neighbor Signaled Session Down | The peer voluntarily brings down the session while the local BFD is up. | Check peer BFD configuration. |
| Administratively Down | The BFD session on the edge is not enabled. | Enable BFD session on the edge. |
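The decision flow above (remote discriminator first, then diag code) can be sketched as a small lookup. This is a hedged Python illustration: the function name next_step and the treatment of a missing or zero Remote_discr as "not populated" are assumptions, and the Control Detection Time Expired entry has no action because none is listed in the table.

```python
# Actions taken from the diag code table; None means no action is listed.
DIAG_ACTIONS = {
    "Control Detection Time Expired": None,
    "Neighbor Signaled Session Down": "Check peer BFD configuration.",
    "Administratively Down": "Enable BFD session on the edge.",
}


def next_step(state: str, remote_discr, diag: str) -> str:
    """Suggest the next troubleshooting step for one BFD session."""
    if state.lower() == "up":
        return "Session is up; no action needed."
    # Assumption: 0 or None means the remote discriminator is not populated.
    if not remote_discr:
        return "Check for connectivity-related issues."
    action = DIAG_ACTIONS.get(diag)
    if action is None:
        return f"No action listed for diag code: {diag}"
    return action


print(next_step("down", 0, "No Diagnostic"))
print(next_step("down", 4097, "Neighbor Signaled Session Down"))
```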
Edge1(tier0_vrf_sr[7])> get bfd-sessions
BFD Session
Dest_port                     : 3784 -----------------------------------> Destination Port
Diag                          : No Diagnostic
Encap                         : vlan
Forwarding                    : last true (current true)
Interface                     : ########-####-####-####-###########
Intf_type                     : LR_PORT
Keep-down                     : false
Last_admin_down_diag_time     : 2024-04-17 13:15:18
Last_cp_diag                  : No Diagnostic
Last_cp_rmt_diag              : No Diagnostic
Last_cp_rmt_state             : up
Last_cp_state                 : up
Last_down_time                : 2024-04-17 13:15:18
Last_fwd_state                : UP
Last_local_down_diag          : Neighbor Signaled Session Down ---------> Edge Diag Code
Last_remote_admin_down_time   : 2024-04-17 13:15:18
Last_remote_down_diag         : Administratively Down
Last_up_time                  : 2024-04-17 13:15:19
Local_address                 : ##.###.##.##.### -----------------------------> Local Address
Local_discr                   : 673456400 ------------------------------> Local Discriminator
Min_rx_ttl                    : 255
Multiplier                    : 3
Received_remote_diag          : No Diagnostic
Received_remote_state         : up
Remote_address                : ##.###.##.##.### ----------------------------> Remote Address
Remote_admin_down             : false
Remote_diag                   : No Diagnostic
Remote_discr                  : 4097 -----------------------------------> Remote Discriminator
Remote_min_rx_interval        : 1000
Remote_min_tx_interval        : 1000
Remote_multiplier             : 3
Remote_state                  : up
Router                        : ########-####-####-####-###########
Router_down                   : false
Rx_cfg_min                    : 500
Rx_interval                   : 1000
Service-link                  : false
Session_type                  : UPLINK
State                         : up -------------------------------------> State
Tx_cfg_min                    : 500 ------------------------------------> Configured Transmit Min Interval
Tx_interval                   : 1000 -----------------------------------> Transmit Interval
Type                          : IPv4
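When the get bfd-sessions output is captured to a file or a support bundle, the fields of interest can be extracted with a short parser. The Python sketch below assumes the CLI prints one "Key : value" pair per line and that the "-----> label" arrows are annotations added for this article rather than part of the real output; the function name parse_bfd_session and the shortened sample are illustrative.

```python
import re


def parse_bfd_session(text: str) -> dict:
    """Parse 'Key : value' lines into a dict, ignoring '----> label' notes."""
    fields = {}
    for line in text.splitlines():
        # Drop a trailing annotation such as '--------> State'.
        line = re.split(r"\s*-{3,}>", line, maxsplit=1)[0]
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # prompt or heading line without a key/value pair
        fields[key.strip()] = value.strip()
    return fields


# Shortened sample modeled on the session output shown above.
sample = """\
BFD Session
Dest_port                     : 3784 ---------> Destination Port
Last_down_time                : 2024-04-17 13:15:18
Local_discr                   : 673456400 ----> Local Discriminator
Remote_discr                  : 4097
State                         : up -----------> State
"""
session = parse_bfd_session(sample)
print(session["State"], session["Dest_port"], session["Remote_discr"])
```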
Edge1> get logical-routers
Logical Router
UUID                                   VRF   LR-ID   Name       Type                       Ports   Neighbors
########-####-####-####-###########    0     0                  TUNNEL                     4       10/5000
########-####-####-####-###########    1     3       SR-tier0   SERVICE_ROUTER_TIER0       6       0/50000
########-####-####-####-###########    3     1       DR-tier0   DISTRIBUTED_ROUTER_TIER0   6       2/50000
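Switch to the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router.
Invoke the NSX CLI command ping {peer_address} to verify connectivity.
Retrieve the segment configuration:
GET /policy/api/v1/infra/segments/{segment-id}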
Sample output:
GET /policy/api/v1/infra/segments/{segment-id}
{
  "type": "DISCONNECTED",
  "vlan_ids": [
    "5"
  ],
  "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/8fc4a476-c2cc-4d8c-866d-eff780627ea9",
  "advanced_config": {
    "hybrid": false,
    "multicast": true,
    "inter_router": false,
    "local_egress": false,
    "urpf_mode": "STRICT",
    "connectivity": "ON"
  },
  "admin_state": "UP",
  "replication_mode": "MTEP",
  "resource_type": "Segment",
  "id": "<tier0-vrfA-uplink>",
  "display_name": "<tier0-vrfA-uplink>",
  "path": "/infra/segments/<tier0-vrfA-uplink>",
  "relative_path": "<tier0-vrfA-uplink>",
  "parent_path": "/infra",
  "unique_id": "########-####-####-####-###########",
  "realization_id": "########-####-####-####-###########",
  "owner_id": "########-####-####-####-###########",
  "marked_for_delete": false,
  "overridden": false,
  "_system_owned": false,
  "_create_time": 1712865979425,
  "_create_user": "admin",
  "_last_modified_time": 1712865979425,
  "_last_modified_user": "admin",
  "_protection": "NOT_PROTECTED",
  "_revision": 0
}
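The segment object can be compared against what the physical network expects. The article does not state which fields to inspect, so the choice of admin_state, advanced_config.connectivity and vlan_ids in this Python sketch is an assumption, as are the function name check_uplink_segment and the expected_vlan parameter. VLAN entries are compared as exact strings; VLAN ranges are not expanded.

```python
def check_uplink_segment(segment: dict, expected_vlan: str) -> list:
    """Return a list of findings for an uplink segment; empty means no finding."""
    findings = []
    if segment.get("admin_state") != "UP":
        findings.append("admin_state is not UP")
    if segment.get("advanced_config", {}).get("connectivity") != "ON":
        findings.append("advanced_config.connectivity is not ON")
    # Simplification: exact string match, no handling of ranges such as "5-10".
    if expected_vlan not in segment.get("vlan_ids", []):
        findings.append(f"VLAN {expected_vlan} not in vlan_ids")
    return findings


# Trimmed-down segment modeled on the sample output above.
segment = {
    "vlan_ids": ["5"],
    "advanced_config": {"connectivity": "ON"},
    "admin_state": "UP",
}
print(check_uplink_segment(segment, "5"))
print(check_uplink_segment(segment, "7"))
```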
Check for a mismatch between single-hop and multihop BFD configurations:
If the external peer is configured with multihop BFD and the NSX Edge is configured with single-hop BFD, or vice versa, the 'BFD session on the external interface is down' alarm may be generated.
The BFD session goes down because of a BFD port mismatch: single-hop BFD uses UDP destination port 3784 and multihop BFD uses UDP destination port 4784.
This can be identified by capturing traffic at the edge and inspecting the packet flows. If outbound traffic uses destination port 4784 and inbound return traffic uses destination port 3784, it indicates a BFD port mismatch.
For example:
05:48:56.047436 #:#:#:#:ce:d4 > #:#:#:#:d6:7e, ethertype IPv4 (0x0800), length 66: 10.#.#.3.51763 > 10.#.#.10.4784: BFDv1, Multihop, State Down, Flags: [none], length: 24
05:48:56.163949 #:#:#:#:d6:7e > #:#:#:#:ce:d4, ethertype IPv4 (0x0800), length 66: 10.#.#.10.49185 > 10.#.#.3.3784: BFDv1, Control, State Down, Flags: [none], length: 24
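A capture like the one above can be scanned for the mismatch automatically. The Python sketch below assumes tcpdump-style lines in the format shown, where each endpoint appears as address.port; the function names and the 192.0.2.x sample addresses are illustrative, and the check simply reports a mismatch when both BFD destination ports appear in the same capture.

```python
import re

SINGLE_HOP_PORT = 3784  # single-hop BFD control packets
MULTIHOP_PORT = 4784    # multihop BFD control packets

# Matches 'src_addr.src_port > dst_addr.dst_port: BFD...'
BFD_LINE = re.compile(r"(\S+)\.(\d+) > (\S+)\.(\d+): BFD")


def bfd_ports_by_sender(lines) -> dict:
    """Map each source address to the set of BFD destination ports it used."""
    ports = {}
    for line in lines:
        match = BFD_LINE.search(line)
        if not match:
            continue
        src_addr, dst_port = match.group(1), int(match.group(4))
        ports.setdefault(src_addr, set()).add(dst_port)
    return ports


def has_port_mismatch(lines) -> bool:
    """True when both single-hop and multihop ports appear in the capture."""
    seen = set()
    for dst_ports in bfd_ports_by_sender(lines).values():
        seen |= dst_ports
    return {SINGLE_HOP_PORT, MULTIHOP_PORT} <= seen


# Hypothetical capture lines modeled on the example above.
capture = [
    "05:48:56.047436 IP 192.0.2.3.51763 > 192.0.2.10.4784: BFDv1, Multihop, State Down",
    "05:48:56.163949 IP 192.0.2.10.49185 > 192.0.2.3.3784: BFDv1, Control, State Down",
]
print(bfd_ports_by_sender(capture))
print(has_port_mismatch(capture))
```

The per-sender mapping shows which side is sending multihop packets, which indicates where the configuration needs to be aligned.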