BFD Tunnels between ESXi hosts and Edges are intermittently going down, with error "Neighbor Signaled Session Down"

Article ID: 419036


Products

VMware NSX

Issue/Introduction

  • BFD Tunnels between ESXi hosts and Edges are intermittently going down.

  • On the ESXi host, the command nsxdp-cli bfd sessions list shows the tunnels stuck in the Init and Down states.

  • On the ESXi host, the command nsxdp-cli bfd stats get shows packet counters incrementing in one direction only.

  • The GET API "/nsxapi/api/v1/transport-nodes/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx/tunnels" returns the error Control_Detection_Time_Expired.

  • In the NSX UI, the BFD Diagnostic Code for the down tunnels shows "Neighbor Signaled Session Down".

  • The default gateway of the ESXi hosts is a virtual router connected to the same vDS as the NSX Edges and the NSX TEP vmkernel (VMK) interfaces on the hosts.
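To make the API symptom above easier to spot, the following is a minimal Python sketch that filters an already-fetched tunnels response for tunnels that are not up. It makes no network call. The field names (tunnels, name, status, remote_ip, bfd, state, diagnostic) and the sample values are assumptions for illustration; verify them against the response returned in your environment.

```python
from typing import Any


def find_unhealthy_tunnels(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return a short summary of every tunnel whose status is not UP.

    Assumption: the response carries a top-level "tunnels" list and each
    entry has "status" and a nested "bfd" object. Adjust the keys if the
    response in your environment differs.
    """
    unhealthy = []
    for tunnel in payload.get("tunnels", []):
        if str(tunnel.get("status", "")).upper() == "UP":
            continue
        bfd = tunnel.get("bfd") or {}
        unhealthy.append(
            {
                "name": tunnel.get("name"),
                "remote_ip": tunnel.get("remote_ip"),
                "status": tunnel.get("status"),
                "bfd_state": bfd.get("state"),
                "bfd_diagnostic": bfd.get("diagnostic"),
            }
        )
    return unhealthy


# Hypothetical sample data, not captured from a real system.
sample = {
    "tunnels": [
        {
            "name": "geneve-example-1",
            "remote_ip": "192.0.2.11",
            "status": "UP",
            "bfd": {"state": "UP", "diagnostic": "No_Diagnostic"},
        },
        {
            "name": "geneve-example-2",
            "remote_ip": "192.0.2.12",
            "status": "DOWN",
            "bfd": {
                "state": "DOWN",
                "diagnostic": "Control_Detection_Time_Expired",
            },
        },
    ]
}

for entry in find_unhealthy_tunnels(sample):
    print(entry)
```

Running the filter against responses collected a few minutes apart can help confirm that the same tunnels between hosts and Edges are the ones flapping.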

Environment

VMware NSX

Cause

When the default gateway of an ESXi transport node (TN) host is a virtual router connected to the same vDS as the Edge nodes and the host TEP vmkernel interfaces, BFD packets transmitted from the vmkernel interfaces are forwarded by the virtual router (the host gateway) directly to the Edge node vNICs. Because these packets never traverse the ESXi uplink ports on the vDS, BFD can behave incorrectly or unexpectedly.
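The condition described above can be checked against an inventory you have already gathered. The sketch below is illustrative only: the function name, the vDS names, and the way the inventory is represented are hypothetical, and nothing here queries vCenter or NSX.

```python
def shared_vds(router_vds: set[str], edge_vds: set[str], tep_vds: set[str]) -> set[str]:
    """Return the vDS names that the virtual router shares with BOTH the
    Edge node vNICs and the host TEP vmkernel interfaces.

    A non-empty result matches the topology described in the Cause section.
    """
    return router_vds & edge_vds & tep_vds


# Hypothetical inventory: vDS names each component is attached to.
router = {"vds-example-01"}
edges = {"vds-example-01", "vds-example-02"}
host_teps = {"vds-example-01"}

overlap = shared_vds(router, edges, host_teps)
if overlap:
    print("Router shares a vDS with Edges and host TEPs:", sorted(overlap))
else:
    print("Router is on a separate vDS.")
```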

Resolution

This is a condition that may occur in a VMware NSX environment. To resolve the issue, move the virtual router VM to a different vDS from the one used by the NSX Edges and the host TEP vmkernel interfaces.

Additional Information

If you are contacting Broadcom support about this issue, please provide the following:

  • NSX Manager support bundles.
  • ESXi host support bundles for the problematic hosts.
  • Edge support bundles from the problematic Edges.
  • Text of any error messages seen in NSX GUI or command lines pertinent to the investigation.

Handling Log Bundles for offline review with Broadcom support: