BFD tunnel down between transport nodes due to NSX-T assigned duplicate TEP IP

Article ID: 380147

Products

VMware NSX

Issue/Introduction

BFD tunnels between transport nodes are down because NSX-T assigned the same TEP IP to more than one managed transport node. The duplicate TEP IPs cause NSX-managed VMs to lose connectivity.

The NSX-T UI shows the transport node in a degraded state.

Environment

VMware NSX-T 3.2.2

VMware NSX 4.1.0

Cause

On a managed transport node (host), the TEP-to-TEP connectivity test completes without failure:

#vmkping -I vmk## -S vxlan -d -s 1572 <destination TEP IP>
Test network connectivity between two TEP endpoints from the ESXi host
vmkping = command
-I vmk## = choose which vmk interface to ping from (uppercase "i", not lowercase "L")
-S vxlan = choose the vxlan / geneve overlay network stack
-d = set the do-not-fragment bit
-s 1572 = set the payload size to 1572 bytes (maximum allowed on a 1600 MTU network: 1600 minus 20 bytes of IPv4 header and 8 bytes of ICMP header)
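
The 1572-byte payload follows from the MTU minus the IPv4 and ICMP headers. The sketch below shows that arithmetic so the -s value can be recomputed for other MTUs. The function name is hypothetical, and it assumes an IPv4 header without options (20 bytes) and the standard 8-byte ICMP echo header.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu):
    """Return the largest ICMP payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("MTU is smaller than the IPv4 and ICMP headers")
    return payload


# Value to pass to vmkping -s on a 1600 MTU network
print(max_ping_payload(1600))
```

If the underlay MTU differs from 1600, pass that MTU instead and use the result as the -s value.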

The output of the following ESXi CLI command shows the BFD tunnel status as DOWN:

 #nsxdp-cli bfd sessions list

This is a known issue in which an IP address from an NSX-T TEP IP pool is allocated to more than one transport node. It is generally seen when an IP address has been released incorrectly, so that it appears available for reuse while it is still in use.

Review nsxapi.log for messages similar to the following, which show the same IP being allocated to multiple transport nodes (TNs):
nsxapi.log INFO L2HostConfigTaskExecutor3 VtepPopulator xxx FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Allocated IP [###.###.123.321] for HostSwitch ##### in TN ######-a3bn-lk62-####-############ 
nsxapi.log INFO L2HostConfigTaskExecutor3 VtepPopulator xxx FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Allocated IP [###.###.123.321] for HostSwitch ##### in TN ######-c18n-78xs-####-############ 
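
The log review above can be partly automated. The following is a minimal sketch, not a supported tool: it assumes the "Allocated IP [...] for HostSwitch ... in TN ..." wording matches the sample messages shown here, and the function name is hypothetical. It groups allocations by IP and reports any IP that appears with more than one transport node ID.

```python
import re
from collections import defaultdict

# Pattern modeled on the sample "Allocated IP" messages in this article (assumption).
ALLOC_RE = re.compile(r"Allocated IP \[([^\]]+)\] for HostSwitch (\S+) in TN (\S+)")


def find_duplicate_tep_ips(lines):
    """Return {ip: sorted TN IDs} for every IP logged as allocated to more than one TN."""
    tns_by_ip = defaultdict(set)
    for line in lines:
        match = ALLOC_RE.search(line)
        if match:
            ip, _host_switch, tn_id = match.groups()
            tns_by_ip[ip].add(tn_id)
    # Keep only the IPs that were handed to two or more distinct transport nodes.
    return {ip: sorted(tns) for ip, tns in tns_by_ip.items() if len(tns) > 1}
```

To use it, open a copy of nsxapi.log and pass the file object to find_duplicate_tep_ips. An IP can legitimately move between transport nodes over time, so treat each hit as a candidate to compare against the TEP IPs currently configured on those nodes.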

Resolution

Upgrade NSX-T to 3.2.3

Upgrade NSX to 4.1.1.0