OSPF neighborship changed from the FULL to the DOWN state alarm

Article ID: 369206

Products

VMware NSX

Issue/Introduction

Title: Alarm indicating that the OSPF neighborship changed from the FULL state to the DOWN state.
Event ID: ospf_neighbor_went_down
Alarm description:

  • Purpose: To notify the admin about the change in the OSPF neighborship state from the FULL state to the DOWN state.
  • Impact: The change in the OSPF neighborship to the DOWN state would cause traffic disruptions.

Environment

VMware NSX-T Data Center
VMware NSX

Resolution

Steps to resolve
For 3.1.1 and higher

Recommended Action:

Please follow the steps below to resolve the alarm.

  • Check for connectivity-related issues: 
    • Perform a ping test from the edge node interface or the external peer.
      • Invoke the NSX CLI command get logical-routers.

        Sample CLI output: get logical-routers
        Edge1> get logical-routers
        Logical Router
        UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
        ########-####-####-####-############   0      0                                        TUNNEL                      4       10/5000
        ########-####-####-####-############   1      3      SR-tier0                          SERVICE_ROUTER_TIER0        6       0/50000
        ########-####-####-####-############   3      1      DR-tier0                          DISTRIBUTED_ROUTER_TIER0    6       2/50000
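      • Switch to the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router, where vrf_id_of_service_router is the VRF value listed for the service router in the get logical-routers output (1 for SR-tier0 in the sample above).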
      • Invoke the NSX CLI command get ospf neighbor to obtain the Neighbor ID.

        Sample CLI output: get ospf neighbor
        nsx-edge-1(tier0_sr)> get ospf neighbor
         
            Neighbor ID     Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
            ##.##.##.##       1 Full/DR           30.173s ##.##.##.##     uplink-275:##.##.##.#     0     0     0
      • Invoke the NSX CLI command get route and ensure a valid route exists in the routing table for the Neighbor ID.
      • Invoke the NSX CLI command ping {Neighbor ID}.
      • If the ping fails:
        • Check the VLAN on the segment/Edge logical uplink and the VLAN on the external neighbor interface. If the VLAN configuration does not match, ping is expected to fail. 
        • Identify the correct VLAN to be configured and ensure it is configured on the edge segment/logical uplink and the interface on the external neighbor connecting to the edge.
        • To check the VLAN configured on the uplink interface of the edge, use the API GET /policy/api/v1/infra/segments/{segment-id}.

          Sample output: GET /policy/api/v1/infra/segments/{segment-id}
          {
              "type": "DISCONNECTED",
              "vlan_ids": [
                  "5"
              ],
              "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/########-####-####-####-############",
              "advanced_config": {
                  "hybrid": false,
                  "multicast": true,
                  "inter_router": false,
                  "local_egress": false,
                  "urpf_mode": "STRICT",
                  "connectivity": "ON"
              },
              "admin_state": "UP",
              "replication_mode": "MTEP",
              "resource_type": "Segment",
              "id": "tier0-vrfA-uplink-##.##.##.#",
              "display_name": "tier0-vrfA-uplink-##.##.##.#",
              "path": "/infra/segments/tier0-vrfA-uplink-##.##.##.#",
              "relative_path": "tier0-vrfA-uplink-##.##.##.#",
              "parent_path": "/infra",
              "unique_id": "########-####-####-####-############",
              "realization_id": "########-####-####-####-############",
              "owner_id": "########-####-####-####-############",
              "marked_for_delete": false,
              "overridden": false,
              "_system_owned": false,
              "_create_time": 1712865979425,
              "_create_user": "admin",
              "_last_modified_time": 1712865979425,
              "_last_modified_user": "admin",
              "_protection": "NOT_PROTECTED",
              "_revision": 0
          }
      • If the ping is successful and the OSPF session remains in the DOWN state: 
        • Check for any firewall rules configured to block the OSPF control packets.
        • Check the OSPF config on the peer.
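The VLAN comparison described in the ping-failure steps can be scripted once the segment response has been retrieved. The following is a minimal sketch, not an official tool: the function name segment_carries_vlan is hypothetical, and it assumes each vlan_ids entry is a string holding either a single ID ("5") or a dash-separated range ("5-10").

```python
import json


def segment_carries_vlan(segment: dict, expected_vlan: int) -> bool:
    """Return True if expected_vlan is covered by the segment's vlan_ids."""
    for entry in segment.get("vlan_ids", []):
        # Assumption: an entry is a single ID ("5") or a range ("5-10").
        low, _, high = str(entry).partition("-")
        start = int(low)
        end = int(high) if high else start
        if start <= expected_vlan <= end:
            return True
    return False


# Trimmed copy of the sample segment response shown above.
sample = json.loads(
    '{"vlan_ids": ["5"], "admin_state": "UP", "resource_type": "Segment"}'
)

# The VLAN expected on the external neighbor interface is supplied by the operator.
print(segment_carries_vlan(sample, 5))
print(segment_carries_vlan(sample, 10))
```

If the function returns False for the VLAN configured on the external neighbor interface, the VLAN mismatch is the likely reason the ping fails.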
  • Check for MTU-related issues:
    • Check the MTU setting on the TOR interface connected to the physical NIC of the DVS uplink which provides connectivity to the Tier-0 uplink.
    • Refer to "Guidance to Set Maximum Transmission Unit".
    • Follow the procedures in KB article #97120 to address common NSX underlying infrastructure connectivity issues.
  • Check for configuration-related issues:
    • Verify that the router ID, hello interval, dead interval, and password (if configured) are set correctly on the OSPF interface of both the edge node and the external peer.
    • Invoke the NSX CLI command get logical-routers.
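    • Switch to the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router, where vrf_id_of_service_router is the VRF value listed for the service router in the get logical-routers output.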
    • Invoke the NSX CLI command get ospf interface to obtain the Internet Address, which is the local address.

      Sample CLI output: get ospf interface
      Edge1(tier0_sr[1])> get ospf interface
      uplink-# is up
        ifindex 21, MTU 1500 bytes, BW 0 Mbit <UP,BROADCAST,RUNNING,MULTICAST>
        Internet Address ##.##.##.#/24, Broadcast ##.##.##.##, Area 0.0.0.0 -------------> Internet Address
        MTU mismatch detection: enabled
        Router ID ##.##.##.#, Network Type BROADCAST, Cost: 10
        Transmit Delay is 1 sec, State DROther, Priority 0
        Designated Router (ID) ##.##.##.##, Interface Address ##.##.##.##
        No backup designated router on this network
        Multicast group memberships: OSPFAllRouters
        Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
          Hello due in 5.434s
        Neighbor Count is 1, Adjacent neighbor count is 1
        Authentication NULL is enabled
    • Check for the state of the OSPF neighbor sessions using the API GET /policy/api/v1/infra/tier-0s/{tier-0-id}/locale-services/{locale-service-id}/ospf/neighbors.

      Example Response: GET /policy/api/v1/infra/tier-0s/{tier-0-id}/locale-services/{locale-service-id}/ospf/neighbors.
      {
          "gateway_path": "/infra/tier-0s/tier0",
          "last_update_timestamp": 1713540682834,
          "results": [
              {
                  "edge_path": "/infra/sites/default/enforcement-points/default/edge-clusters/########-####-####-####-############/edge-nodes/0",
                  "neighbors": [
                      {
                          "neighbor_address": "##.##.##.##", ------------> Neighbor Address
                          "neighbor_status_info": [
                              {
                                  "interface_name": "uplink-288:##.##.##.##",
                                  "source_address": "##.##.##.##",
                                  "priority": 1,
                                  "state": "Full",
                                  "last_state_change": "5m19s",
                                  "dead_time": "36.348s", -------------------> Dead Time
                                  "retransmit_counter": 0,
                                  "request_counter": 0,
                                  "database_summary_counter": 0
                              }
                          ]
                      }
                  ]
              }
          ],
          "result_count": 1
      }

      NOTE:

      • The area configuration, password, and MTU must match on the edge node and the external peer for the OSPF neighborship to form.

      • By design, the edge node cannot be configured as a designated router (DR) or backup designated router (BDR).
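
The configuration checks above can be partly automated. The following is a minimal sketch under stated assumptions rather than a supported utility: neighbors_not_full walks a response shaped like the OSPF neighbors example above and lists sessions whose state is not Full, and mismatched_params compares settings that must agree on both sides. Both function names, the dictionary keys passed to mismatched_params, and all addresses and paths in the sample data are hypothetical; the values would have to be collected by hand from the edge node and the peer.

```python
from typing import Dict, List, Tuple


def neighbors_not_full(response: dict) -> List[Tuple[str, str, str, str]]:
    """List (edge_path, neighbor_address, interface_name, state) for
    every neighbor session whose state is not Full."""
    problems = []
    for result in response.get("results", []):
        edge_path = result.get("edge_path", "")
        for neighbor in result.get("neighbors", []):
            address = neighbor.get("neighbor_address", "")
            for info in neighbor.get("neighbor_status_info", []):
                state = info.get("state", "")
                # The CLI reports "Full/DR"; the API sample reports "Full".
                if state.split("/")[0].lower() != "full":
                    problems.append(
                        (edge_path, address, info.get("interface_name", ""), state)
                    )
    return problems


# Hypothetical key names for settings that must agree on both sides.
MUST_MATCH = ("area", "hello_interval", "dead_interval", "mtu", "password")


def mismatched_params(edge: Dict[str, object], peer: Dict[str, object]) -> List[str]:
    """Return the names of the settings that differ between edge and peer."""
    return [key for key in MUST_MATCH if edge.get(key) != peer.get(key)]


# Illustrative data only; addresses use the documentation range 192.0.2.0/24.
sample_response = {
    "results": [
        {
            "edge_path": "/infra/example/edge-nodes/0",
            "neighbors": [
                {
                    "neighbor_address": "192.0.2.1",
                    "neighbor_status_info": [
                        {"interface_name": "uplink-1", "state": "Full"}
                    ],
                },
                {
                    "neighbor_address": "192.0.2.2",
                    "neighbor_status_info": [
                        {"interface_name": "uplink-2", "state": "Init"}
                    ],
                },
            ],
        }
    ]
}

edge = {"area": "0.0.0.0", "hello_interval": 10, "dead_interval": 40,
        "mtu": 1500, "password": None}
peer = {"area": "0.0.0.0", "hello_interval": 10, "dead_interval": 40,
        "mtu": 9000, "password": None}

print(neighbors_not_full(sample_response))
print(mismatched_params(edge, peer))
```

The router ID is deliberately left out of the comparison, because OSPF requires router IDs to be unique per router rather than identical on both sides.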