Title: Alarm to indicate that the status of the routing subsystem on the edge node is down.
Event ID: routing_down
Alarm description:
The routing subsystem on the edge node is down. When the routing subsystem on the edge node is down, no communication can be carried out of the edge node, which causes complete traffic loss.
The VMware NSX (formerly VMware NSX-T Data Center) routing subsystem drives the communication to the external domain. It works principally on a service router and uses routing protocols to install and maintain routes to the external domain. The routing subsystem depends on a BGP, OSPF, or BFD configuration being present on the edge node and in the UP/ESTABLISHED state. If they are in the DOWN state, the routing subsystem is considered to be down.
Steps to resolve
For 3.0.0 and higher
Check for the presence of the EdgeHealthAlarm, and follow the recommended actions provided in that alarm.
Verify if a service router is configured on the edge node using the API GET /policy/api/v1/infra/tier-0s.
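As a rough illustration of working with the result of that request, the sketch below extracts the gateway identifiers from an already-fetched response. It assumes the response is a JSON object with a results list whose entries carry an id field; the sample payload and the function name tier0_ids are hypothetical, and the sketch does not itself decide whether a service router exists.

```python
import json
from typing import List

# Path taken from the step above; the manager host and credentials are not shown here.
TIER0_LIST_PATH = "/policy/api/v1/infra/tier-0s"


def tier0_ids(list_response: dict) -> List[str]:
    """Return the id of every Tier-0 gateway in a list response.

    Assumption: the response has a "results" list whose entries carry "id".
    """
    return [gateway["id"] for gateway in list_response.get("results", [])]


# Hypothetical sample payload, shaped like the assumed list response.
sample = json.loads('{"results": [{"id": "t0-gw-1"}, {"id": "t0-gw-2"}]}')
print(tier0_ids(sample))
```

The identifiers returned here are the values that would be substituted for tier-0-id in the status requests used in the later steps.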
If no service router is configured, the routing subsystem on the edge node will be down, and communication over the uplinks will not be possible.
A BGP, OSPF, or BFD session must be in the UP/ESTABLISHED state for the routing subsystem to be up.
Check for the presence of BGP neighbors, and the current state of the sessions, using the API GET /policy/api/v1/infra/tier-0s/{tier-0-id}/locale-services/{locale-service-id}/bgp/neighbors/status
In the above command, replace tier-0-id with the actual name of the T0 gateway; locale-service-id is usually default.
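The placeholder substitution for the BGP status request, and a check of the returned session states, might look like the following sketch. The path comes from the step above; the response field names connection_state and neighbor_address, the sample payload, and both function names are assumptions made for illustration and should be checked against the API reference for the installed version.

```python
from urllib.parse import quote

BGP_STATUS_PATH = (
    "/policy/api/v1/infra/tier-0s/{tier0}"
    "/locale-services/{locale}/bgp/neighbors/status"
)


def bgp_status_path(tier0_id: str, locale_service_id: str = "default") -> str:
    """Fill in the two placeholders, percent-encoding each path segment."""
    return BGP_STATUS_PATH.format(
        tier0=quote(tier0_id, safe=""),
        locale=quote(locale_service_id, safe=""),
    )


def not_established(status_response: dict) -> list:
    """List neighbors whose session is in any state other than ESTABLISHED.

    Assumption: each entry in "results" carries "connection_state" and
    "neighbor_address".
    """
    return [
        neighbor.get("neighbor_address")
        for neighbor in status_response.get("results", [])
        if neighbor.get("connection_state") != "ESTABLISHED"
    ]


# Hypothetical payload: one healthy session and one that is still connecting.
sample_status = {
    "results": [
        {"neighbor_address": "192.0.2.1", "connection_state": "ESTABLISHED"},
        {"neighbor_address": "192.0.2.2", "connection_state": "ACTIVE"},
    ]
}
print(bgp_status_path("t0-gw-1"))
print(not_established(sample_status))
```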
If the state of a session is not ESTABLISHED, check for the presence of the bgp_down alarm, and follow the recommended actions provided in that alarm.
Check for the presence of OSPF neighbor sessions using the API GET /policy/api/v1/infra/tier-0s/{tier-0-id}/locale-services/{locale-service-id}/ospf/neighbors.
In the above command, replace tier-0-id with the actual name of the T0 gateway; locale-service-id is usually default.
If the state of a neighbor is not FULL, check for the presence of the ospf_neighbor_went_down alarm, and follow the recommended actions provided in that alarm.
Invoke the NSX CLI command get logical-routers.
Enter the VRF context of the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router. Invoke the NSX CLI command get ospf neighbor to obtain the Neighbor ID and check the current State.
Invoke the NSX CLI command get logical-routers.
Enter the VRF context of the service router {sr_id} using the NSX CLI command vrf vrf_id_of_service_router. Invoke the NSX CLI command get bfd-sessions and verify if any session's current state is UP.
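The BFD check can be expressed as a small decision helper. The sketch below takes session states that an operator has already read from the get bfd-sessions output (it makes no assumption about that output's layout) and returns the alarm name mentioned in this article when no session is UP. The function name bfd_follow_up is hypothetical.

```python
from typing import Iterable, Optional


def bfd_follow_up(session_states: Iterable[str]) -> Optional[str]:
    """Return the alarm to investigate next, or None if a session is UP.

    session_states: state strings copied by hand from the CLI output.
    """
    # Normalise case and whitespace so "up", " Up " and "UP" all match.
    if any(state.strip().upper() == "UP" for state in session_states):
        return None
    return "bfd_down_on_external_interface"


print(bfd_follow_up(["down", "up"]))
print(bfd_follow_up(["down", "init"]))
```

An empty list of states is treated the same as no session being UP, which seems the safer default when nothing could be read from the edge node.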
If no session is UP, check for the presence of the bfd_down_on_external_interface alarm, and follow the recommended actions provided in that alarm.
Is a maintenance window required for remediation?
No