One or more transport nodes appear in a Degraded state when viewed in the NSX UI under Fabric > Hosts. BFD Tunnel status flaps intermittently across multiple ESXi hosts in the cluster.
System logs (vmkernel.log) show events indicating BFD sessions flapping beyond threshold limits and dropping:
2025-10-23T00:36:18.766Z In(182) vmkernel: cpu53:2098407)BFD_HandleStatusChange:873:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 10.##.##.67, remote: 10.##.##.38 is flapping beyond 10 times in 5 minutes
2025-10-23T00:36:18.808Z In(182) vmkernel: cpu48:2098407)BFD_HandleStatusChange:873:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 10.##.##.67, remote: 10.##.##.166 is flapping beyond 10 times in 5 minutes
2025-10-23T16:49:01.736Z In(182) vmkernel: cpu48:2098407)BFD_HandleStatusChange:851:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 10.##.##.67, remote: 10.##.##.38, oldState: up, newState: down, diag: Neighbor Signaled Session Down, type: overlay
2025-10-23T16:49:27.437Z In(182) vmkernel: cpu62:2098407)BFD_HandleStatusChange:851:[nsx@6876 comp="nsx-esx" subcomp="bfd"]local: 10.##.##.67, remote: 10.##.##.166, oldState: up, newState: down, diag: Neighbor Signaled Session Down, type: overlayNote: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
VMware NSX
The issue is caused by a Layer-1 physical network fault resulting in significant 'Receive CRC errors' on the ESXi host uplink (e.g., vmnic0). The Cyclic Redundancy Check (CRC) mismatch indicates packet corruption at the physical layer, prompting the destination host to discard the packets. This packet loss causes the BFD probes to drop, forcing the peer node to bring the tunnel session down.
To resolve this issue, please follow the steps below:
esxcli network nic stats get -n vmnic0esxcli network nic stats get -n vmnic0
For more information, please refer to the Knowledge Base (KB) article linked below.
Troubleshooting NIC errors and other network traffic faults in ESXi