Alert Definition Name: Operations for Networks-NSX-T Host Node Tunnel Status is 'Degraded'

Article ID: 429070

Updated On:

Products

VMware NSX

Issue/Introduction

  • After an Edge Node transitions to the Active role, the HA state flaps between the Edge Nodes, causing intermittent traffic loss and instability.

  • The NSX Edge logs (specifically the datapathd logs) show the BFD session for the uplink interface repeatedly transitioning to the down state with the diagnostic "Neighbor Signaled Session Down":
    2026-##-####:##:##.#### edge-xxxx.xxxxx.xxxx NSX 1045207 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="bfd" tname="dp-bfd-mon4" level="INFO"] x.x.x.x->x.x.x.x/vlan: BFD state change: up->down "No Diagnostic"->"Neighbor Signaled Session Down".

  • The corresponding BGP session goes down almost immediately, triggering the HA failover/failback mechanism and causing the continuous flapping:
    2026:##:##.###Z edge-1... NSX 1044271 FABRIC [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="routing-service-realization" level="INFO"] Alarm for BGP ##.##.##.##, state=BGP_DOWN
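To gauge how often the session flaps, the datapathd log can be scanned for these state changes. The following is a minimal sketch, not an official tool: the regular expression is modeled only on the sample line above and may need adjusting for other releases, and the function name, the sample lines, and the 192.0.2.x documentation addresses are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the sample datapathd line shown above (an assumption;
# the exact log format may differ between NSX releases).
BFD_CHANGE = re.compile(
    r'(?P<session>\S+->\S+): BFD state change: '
    r'(?P<old>\w+)->(?P<new>\w+) '
    r'"(?P<old_diag>[^"]*)"->"(?P<new_diag>[^"]*)"'
)


def count_bfd_down_events(lines):
    """Count transitions into the down state, keyed by (session, new diagnostic)."""
    counts = Counter()
    for line in lines:
        match = BFD_CHANGE.search(line)
        if match and match.group("new") == "down":
            counts[(match.group("session"), match.group("new_diag"))] += 1
    return counts


# Hypothetical, shortened log lines using documentation addresses.
SAMPLE_LINES = [
    'level="INFO"] 192.0.2.1->192.0.2.2/vlan: BFD state change: up->down '
    '"No Diagnostic"->"Neighbor Signaled Session Down".',
    'level="INFO"] Alarm for BGP 192.0.2.2, state=BGP_DOWN',
    'level="INFO"] 192.0.2.1->192.0.2.2/vlan: BFD state change: up->down '
    '"No Diagnostic"->"Neighbor Signaled Session Down".',
]

if __name__ == "__main__":
    for (session, diag), hits in count_bfd_down_events(SAMPLE_LINES).items():
        print(f"{session}: {hits} x {diag}")
```

A high count for a single session with the same diagnostic suggests a persistent problem on that peer rather than a one-off event.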

Environment

VMware NSX - 4.X

Cause

The NSX Edge BFD implementation follows RFC 5880. The diagnostic "Neighbor Signaled Session Down" (Diagnostic Code 3) is recorded by the Edge when the remote system signals that its side of the session is Down, which indicates that the remote system itself decided to tear down the BFD session. The root cause is therefore external to the NSX Edge and lies with the immediate physical neighbor on the uplink path.
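For reference when reading other BFD log entries, the diagnostic codes defined in RFC 5880 (section 4.1) can be expressed as a small lookup table. This is an illustrative sketch: the helper names are hypothetical, and it assumes the log prints the diagnostic using the RFC's wording, as in the sample above.

```python
# Diagnostic codes from RFC 5880, section 4.1 (codes 9-31 are reserved).
RFC5880_DIAG = {
    0: "No Diagnostic",
    1: "Control Detection Time Expired",
    2: "Echo Function Failed",
    3: "Neighbor Signaled Session Down",
    4: "Forwarding Plane Reset",
    5: "Path Down",
    6: "Concatenated Path Down",
    7: "Administratively Down",
    8: "Reverse Concatenated Path Down",
}

# Reverse lookup, case-insensitive, for diagnostic text taken from a log line.
_CODE_BY_NAME = {name.lower(): code for code, name in RFC5880_DIAG.items()}


def diag_code(name):
    """Return the RFC 5880 code for a diagnostic string, or None if unknown."""
    return _CODE_BY_NAME.get(name.strip().lower())


def torn_down_by_neighbor(name):
    """True when the diagnostic says the remote peer signaled the session down."""
    return diag_code(name) == 3


if __name__ == "__main__":
    text = "Neighbor Signaled Session Down"
    print(diag_code(text), torn_down_by_neighbor(text))
```

A diagnostic such as "Control Detection Time Expired" (code 1) would instead mean the local side stopped receiving BFD control packets in time, which calls for a different investigation.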

Resolution

This issue is primarily a physical network problem and must be investigated on the immediate BGP/BFD neighbor.
Use the following points as troubleshooting guidelines.

  • Check the BFD and BGP neighbor configuration on the peer device.
  • Check the interfaces on the uplink path for errors.
  • If BFD is not strictly required, disabling it for the affected BGP neighbor stops the rapid flapping and stabilizes the HA state. This is recommended only as a temporary measure when immediate access to the neighbor device is unavailable.

Additional Information

Troubleshooting_Edge_BFD_Tunnels_down