BFD tunnel between Edge and host goes down on a LAG port after vMotion of the Edge VM to an NSX-T prepared host

Article ID: 338751

Products

VMware NSX

Issue/Introduction

Symptoms:
In NSX-T 3.1.x releases, North-South connectivity is lost for all VMs on the same host switch as an Edge VM running on the ESXi host.


Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x

Cause

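North-South connectivity is lost for all VMs on the same host switch as the Edge VM on the ESXi host. In the log lines below, the port ID reported for the first LAG uplink changes from 0x8400000b to 0x0: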

2021-09-20T10:35:14.132Z cpu11:2097929)VDL2GetLAGFirstMemberPortID:328:[nsx@6876 comp="nsx-esx" subcomp="vdl2-18329003"]LAG lag1 first uplink : portName:lag1-0, portID:0x8400000b

2021-09-20T10:35:17.322Z cpu10:2098077)VDL2GetLAGFirstMemberPortID:328:[nsx@6876 comp="nsx-esx" subcomp="vdl2-18329003"]LAG lag1 first uplink : portName:lag1-0, portID:0x0

The call to vmk_LACPGetLAGUplinks, which retrieves the vmnics underlying the LAG, returns nothing. This causes a failure in VDL2, because VDL2 then sets the uplink port ID used by the vmknic to 0.
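
To check whether a host has logged this signature, the log lines above can be searched for a LAG first-uplink entry whose port ID is 0x0. The following is a minimal sketch, not an official diagnostic: the function name find_zero_lag_portid is hypothetical, and the default path /var/log/vmkernel.log is an assumption, since the article does not name the log file.

```shell
# Hypothetical helper: print VDL2 LAG log lines whose first-uplink portID is 0x0.
# The default log path is an assumption; pass the file that holds the messages.
find_zero_lag_portid() {
    log_file="${1:-/var/log/vmkernel.log}"
    # Keep only the VDL2 LAG lookup lines, then those ending in portID:0x0.
    # A healthy value such as 0x8400000b does not match the end-of-line anchor.
    grep 'VDL2GetLAGFirstMemberPortID' "$log_file" \
        | grep -E 'portID:0x0[[:space:]]*$'
}
```

Running find_zero_lag_portid with no argument reads the assumed default path. Any output suggests the LAG lookup returned no uplink at that timestamp.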

Resolution

Upgrade to NSX-T 3.1.4 or 3.2.x for a permanent fix.

Workaround:
Bring down and then bring back up the uplink (vmnic) that is mapped to the VTEP (vmk) under the overlay portset (host switch).

First, identify which physical interfaces the NSX switch uses.

Then, find the VTEP (vmkX) to vmnic mapping by running esxtop at the ESXi CLI and pressing n:

Finally, bring the identified interfaces down and then back up with the commands below:
esxcli network nic down -n vmnicX
esxcli network nic up -n vmnicX
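
For repeated use, the two commands can be wrapped so that the uplink always goes down first and then comes back up. This is a sketch only: the function name bounce_vmnic and the 5-second pause are illustrative assumptions and are not part of the article. Only the two esxcli commands come from the workaround itself.

```shell
# Sketch: bounce one uplink, down first and then up.
# bounce_vmnic and the default wait are hypothetical, chosen for illustration.
bounce_vmnic() {
    nic="$1"
    wait_seconds="${2:-5}"
    if [ -z "$nic" ]; then
        echo "usage: bounce_vmnic vmnicX [wait_seconds]" >&2
        return 1
    fi
    # Stop here if the link could not be brought down.
    esxcli network nic down -n "$nic" || return 1
    # Short pause before restoring the link (assumed value, adjust as needed).
    sleep "$wait_seconds"
    esxcli network nic up -n "$nic"
}
```

Example: bounce_vmnic vmnic2 bounces vmnic2 with the default pause. Run it once for each vmnic mapped to the VTEP.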


Additional Information

https://www.yellow-bricks.com/esxtop/

Impact/Risks:
No North-South L3 connectivity for overlay workload VMs placed on the same hosts as the Edge VMs.