Symptoms:
In NSX-T 3.1.x releases, north-south connectivity is lost for all VMs attached to the same host switch as an Edge VM running on the ESXi host.
VMware NSX-T Data Center
VMware NSX-T Data Center 3.x
When the issue occurs, the ESXi vmkernel log shows the port ID of the LAG's first uplink changing from a valid value to 0x0:
2021-09-20T10:35:14.132Z cpu11:2097929)VDL2GetLAGFirstMemberPortID:328:[nsx@6876 comp="nsx-esx" subcomp="vdl2-18329003"]LAG lag1 first uplink : portName:lag1-0, portID:0x8400000b
2021-09-20T10:35:17.322Z cpu10:2098077)VDL2GetLAGFirstMemberPortID:328:[nsx@6876 comp="nsx-esx" subcomp="vdl2-18329003"]LAG lag1 first uplink : portName:lag1-0, portID:0x0
vmk_LACPGetLAGUplinks, which is called to get the vmnics underlying the LAG, returns nothing. This causes a failure in VDL2, which then sets the uplink port ID used by the vmknic to 0.
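To check whether a host has logged this condition, the log can be filtered for VDL2GetLAGFirstMemberPortID entries that end in portID:0x0. The following is a minimal sketch, not a supported tool: the function name find_zero_lag_port is hypothetical, and the match pattern is based only on the two log lines shown above.

```shell
# Hypothetical helper: print VDL2 LAG log lines whose first-uplink
# port ID is exactly 0x0. Reads the files given as arguments, or
# stdin when no file is given.
find_zero_lag_port() {
    # First grep keeps only the LAG first-member messages; the second
    # keeps those ending in portID:0x0 (optional trailing whitespace).
    grep 'VDL2GetLAGFirstMemberPortID' "$@" | grep 'portID:0x0[[:space:]]*$'
}
```

On an ESXi host this could be run as find_zero_lag_port /var/run/log/vmkernel.log, assuming the vmkernel log is in its usual location. The function returns a non-zero status when no matching line is found.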
Upgrade to NSX-T 3.1.4 or 3.2.x for a permanent fix.
Workaround:
Bring the uplink (vmnic) that is mapped to the VTEP (vmk) on the overlay portset (host switch) down and then back up.
First, find out which physical interfaces the NSX switch uses.
Then determine the VTEP (vmkX) to vmnic mapping by running esxtop from the ESXi CLI and pressing n to switch to the network view. The TEAM-PNIC column shows the vmnic that each vmk port is using.
Finally, bring those interfaces down and then back up with the commands below:
esxcli network nic down -n vmnicX
esxcli network nic up -n vmnicX
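If several uplinks need to be bounced, the down/up sequence can be wrapped in a small shell function. This is only a sketch under stated assumptions: bounce_nic and the DRY_RUN variable are hypothetical names introduced here, and the 5-second default pause is an arbitrary choice, not a documented requirement. Only the two esxcli commands themselves come from the workaround above.

```shell
# Hypothetical helper: bring one uplink down, wait, then bring it up.
# Usage: bounce_nic vmnicN [seconds]
# Set DRY_RUN=1 to print the esxcli commands instead of running them.
bounce_nic() {
    nic="$1"
    wait_s="${2:-5}"   # pause between down and up; 5 s is an assumption

    # Accept only names that start with "vmnic" followed by a digit.
    case "$nic" in
        vmnic[0-9]*) ;;
        *) echo "usage: bounce_nic vmnicN [seconds]" >&2; return 1 ;;
    esac

    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"

    # Down first, then up, so the uplink is left enabled afterwards.
    $run esxcli network nic down -n "$nic" || return 1
    [ -n "$run" ] || sleep "$wait_s"
    $run esxcli network nic up -n "$nic"
}
```

Running DRY_RUN=1 bounce_nic vmnic2 prints the two esxcli commands without touching the interface, which is a safe way to confirm the vmnic name before running it for real.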