Troubleshooting NSX TEP/BFD Tunnels between ESXi hosts and Edges

Products

VMware NSX

VMware vSphere ESXi

Issue/Introduction

When troubleshooting BFD tunnels between NSX components (hosts and Edges), a specific set of data must be gathered at the time of the event. This article details what data is required and how to gather it prior to opening a support request with Broadcom.

NSX uses TEP tunnels for several important purposes:

East/West communication between VMs on overlay networks that run on different hosts
Access to Edges (VM and Bare Metal) for North/South services via Service Routers
Edge High Availability - See Troubleshooting NSX Edge High Availability for more information
Verify health of BFD tunnels from various components - View Bidirectional Forwarding Detection Status

Documentation on how TEP Tunnels work can be found at the following links:

NSX Reference Design Guide - Section 3.2 Logical Switching
NSX-T Edge TEP networking options

Environment

VMware NSX

VMware vSphere ESXi

Resolution

Log Locations and Keywords:

NSX Edge
- /var/log/syslog*
- Relevant Edge Log Keywords
  - HA tunnel
  - (geneve) state updated from
  - Total tunnels:
  - Process DP BFD state update

NSX Prepared ESXi host
- /var/log/vmkernel*
- Relevant Host Log Keywords
  - diag: Control
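
The keywords listed above can be used to narrow a large log file down to the BFD-related lines. The following is a minimal Python sketch for offline review of logs that have already been copied off the Edge or host. The keyword strings come from this article; the function name and the sample lines are illustrative assumptions, and the match is a plain substring test rather than anything NSX-specific.

```python
# Keyword strings are taken from the lists in this article.
# The function name and sample data are hypothetical, for illustration only.
EDGE_KEYWORDS = (
    "HA tunnel",
    "(geneve) state updated from",
    "Total tunnels:",
    "Process DP BFD state update",
)
HOST_KEYWORDS = ("diag: Control",)


def filter_lines(lines, keywords):
    """Return the lines that contain at least one keyword (substring match)."""
    return [line for line in lines if any(k in line for k in keywords)]


if __name__ == "__main__":
    # Placeholder lines; in practice, read them from a copy of
    # /var/log/syslog* (Edge) or /var/log/vmkernel* (ESXi host).
    sample = [
        "HA tunnel 192.0.2.35:192.0.2.39 state changed from Up to Unreachable",
        "an unrelated log line",
    ]
    for line in filter_lines(sample, EDGE_KEYWORDS):
        print(line)
```

Passing HOST_KEYWORDS instead of EDGE_KEYWORDS applies the same filter to vmkernel logs.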

CLI Commands

ESXi hosts
- nsxdp-cli bfd sessions list
  - Lists all TEP tunnels from this host's TEPs to all other hosts and Edges. ESXi hosts with no NSX workload using the overlay will not have active tunnels. Flaps indicate either network instability or a TEP losing connectivity for a legitimate reason (power up/down, maintenance mode, etc.). Even in a stable environment with all endpoints up, the Flaps column will show at least 1.
- vmkping -I vmk## -S vxlan -d -s 1572 <destination TEP IP>
  - Test network connectivity between two TEP endpoints from the ESXi host
    - vmkping = command
    - -I vmk## = choose which VMK interface to ping from (uppercase "eye", not lowercase "ell")
    - -S vxlan = choose the vxlan / geneve overlay network stack
    - -d = set the do-not-fragment (DF) bit
    - -s 1572 = set the ICMP payload size to 1572 bytes (the maximum on a 1600 MTU network: 1600 bytes minus the 20-byte IPv4 header and the 8-byte ICMP header)
- esxcli network firewall set -e 0
  - Temporarily disable the ESXi host's internal firewall to rule out any rules that may be dropping BFD traffic, then observe the state of the tunnels on the host.
    Once the ESXi firewall has been ruled out, re-enable it and validate its status:
    esxcli network firewall set -e 1
    esxcli network firewall get
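
The 1572-byte value used with vmkping -s above follows from simple header arithmetic. The sketch below shows how the same calculation applies to other MTU values. It assumes an IPv4 header without options (20 bytes) and the standard 8-byte ICMP echo header; the function name is hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return payload


if __name__ == "__main__":
    print(max_ping_payload(1600))  # 1572, the value used with vmkping -s
    print(max_ping_payload(9000))  # 8972, for a jumbo-frame TEP network
```

If a ping at the calculated size fails with the do-not-fragment bit set while a smaller size succeeds, a device on the path is likely configured with a smaller MTU than the TEP network requires.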

NSX Edges
- get bfd-sessions
  - Edge equivalent of nsxdp-cli bfd sessions list on ESXi hosts
- get bfd-sessions stats
  - Statistics regarding packets dropped and their reasons for each TEP tunnel.

Known configuration issues that can affect TEP tunnels

Known Issues with NSX BFD TEP Tunnels

Helpful Information regarding TEP Tunnel configurations and requirements

Additional Information

Log Line Analysis:

Edge /var/log/syslog*

142585:2024-##-##T##:##:##.###Z <Edge-VM-Name01> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="ha-cluster" level="INFO"] HA tunnel 192.###.###.35:192.###.###.39 state changed from Up to Unreachable

These tunnel endpoints have experienced a BFD connectivity timeout. The tunnel has gone down because (non-exhaustive list of examples):
- The remote endpoint is not sending BFD information to the local endpoint
- The environment is busy and BFD packets are getting delayed or dropped between TEP endpoints

142622:2024-##-##T##:##:##.###Z <Edge-VM-Name01> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="ha-cluster" level="INFO"] HA tunnel 192.###.###.35:192.###.###.39 state changed from Unreachable to Up

These tunnel endpoints have begun receiving BFD information again. The tunnel is returning to functional status.

NOTE: Seeing these two log lines repeatedly and close together for the same pair of endpoints indicates network flapping or high latency at one endpoint or at some point on the path between them.
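
To spot the flapping pattern described in the note, the HA tunnel lines can be counted per endpoint pair. The following Python sketch assumes the log line format shown in the two Edge examples above ("HA tunnel <local>:<remote> state changed from <old> to <new>") with IPv4 addresses. The function name and sample addresses are hypothetical, and the sketch counts only transitions away from Up.

```python
import re
from collections import Counter

# Pattern derived from the two Edge log examples in this article.
# Assumes IPv4 endpoints, so each address contains no ":" character.
HA_TUNNEL_RE = re.compile(
    r"HA tunnel (?P<local>[^\s:]+):(?P<remote>[^\s:]+) "
    r"state changed from (?P<old>\w+) to (?P<new>\w+)"
)


def count_tunnel_drops(lines):
    """Count Up -> not-Up transitions per (local, remote) endpoint pair."""
    drops = Counter()
    for line in lines:
        match = HA_TUNNEL_RE.search(line)
        if match and match["old"] == "Up" and match["new"] != "Up":
            drops[(match["local"], match["remote"])] += 1
    return drops


if __name__ == "__main__":
    # Placeholder lines; in practice, read a copy of /var/log/syslog*.
    sample = [
        "HA tunnel 192.0.2.35:192.0.2.39 state changed from Up to Unreachable",
        "HA tunnel 192.0.2.35:192.0.2.39 state changed from Unreachable to Up",
        "HA tunnel 192.0.2.35:192.0.2.39 state changed from Up to Unreachable",
    ]
    for pair, count in count_tunnel_drops(sample).most_common():
        print(pair, count)
```

Endpoint pairs with a high count over a short period are the ones to investigate first.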

ESXi host /var/log/vmkernel*

2024-##-##T##:##:##.###Z cpu59:2098707)BFD_HandleStatusChange:709:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 192.###.###.34, remote: 192.###.###.23, oldState: up, newState: down, diag: Control Detection Time Expired, type: overlay
- Log line showing that the tunnel between the two listed IP addresses is down
2024-##-##T##:##:##.###Z cpu36:2098706)BFD_HandleStatusChange:709:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 192.###.###.34, remote: 192.###.###.23, oldState: down, newState: init, diag: Control Detection Time Expired, type: overlay
- Log line showing that the tunnel between the two listed IP addresses is coming up and connectivity has been restored
2024-##-##T##:##:##.###Z cpu36:2098706)BFD_HandleStatusChange:709:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 192.###.###.34, remote: 192.###.###.23, oldState: down, newState: up, diag: No Diagnostic, type: overlay
- Log line showing that the tunnel is fully up and capable of processing TEP traffic
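
The three vmkernel examples above share one format, so the fields can be extracted with a single pattern. The following Python sketch assumes that format (local, remote, oldState, newState, diag, type, in that order) and maps newState to the descriptions given above. The class, function names, and sample addresses are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class BfdEvent(NamedTuple):
    local: str
    remote: str
    old_state: str
    new_state: str
    diag: str
    tunnel_type: str


# Pattern derived from the vmkernel examples in this article.
BFD_RE = re.compile(
    r"BFD_HandleStatusChange.*?"
    r"local: (?P<local>[^,]+), remote: (?P<remote>[^,]+), "
    r"oldState: (?P<old>\w+), newState: (?P<new>\w+), "
    r"diag: (?P<diag>[^,]+), type: (?P<type>\w+)"
)


def parse_bfd_event(line: str) -> Optional[BfdEvent]:
    """Return the BFD state change in a vmkernel line, or None if absent."""
    m = BFD_RE.search(line)
    if m is None:
        return None
    return BfdEvent(m["local"], m["remote"], m["old"], m["new"],
                    m["diag"], m["type"])


def describe(event: BfdEvent) -> str:
    """Map newState to the meanings listed in this article."""
    meanings = {"down": "tunnel down", "init": "tunnel coming up",
                "up": "tunnel up"}
    return meanings.get(event.new_state, "other state change")


if __name__ == "__main__":
    # Placeholder line modeled on the first example above.
    sample = ('cpu59:2098707)BFD_HandleStatusChange:709:[nsx@6876 '
              'comp="nsx-esx" subcomp="bfd"]local: 192.0.2.34, '
              'remote: 192.0.2.23, oldState: up, newState: down, '
              'diag: Control Detection Time Expired, type: overlay')
    event = parse_bfd_event(sample)
    if event is not None:
        print(describe(event), event.local, event.remote, event.diag)
```

Lines that do not contain a BFD state change return None, so the parser can be run over an entire vmkernel log.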

If you are contacting Broadcom Support about this issue, please provide the following:

Retrieve log bundles from all NSX Edges and all NSX prepared ESXi hosts with TEP/BFD Tunnels reporting down
Retrieve log bundles from all NSX Managers

Handling Log Bundles for offline review with Broadcom support