In an NSX-T Federation environment, an NSX-T Edge Node has been powered off ungracefully, either through the vCenter UI or because of a crash.
The host continues to show the VTEP of the offline Edge Node as active instead of removing it, so stretched NSX segment traffic is still sent to the offline Edge Node. As a result, traffic through the NSX-T Edge Nodes is disrupted.
To confirm this issue, run the command "get logical-switch [logical switch UUID] vtep-group" as the admin user on an NSX-T Edge Node after the affected Edge Node has been powered off. The output is similar to the following, with the VTEP of the powered-off Edge Node still reported as active:
Wed May 10 2023 UTC 19:12:37.973
VTEP Group Label: 45057
Type: Gateway
HA Type: Active/Standby
Activeness Proto: Activeness Notification
HA State Sync (ms): 32312
Active Mbr: 1
Label: 46081
VTEP IP: 192.168.1.151
VTEP MAC: 0a:00:08:2f:55:a6
State: 1 <==== Active
BFD Count: 0
Label: 109569
VTEP IP: 192.168.1.152
VTEP MAC: 0a:00:08:37:10:ae
State: 1 <==== Active
BFD Count: 0
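When this output has to be checked for several segments, the comparison can be scripted. The following is a minimal Python sketch, not a VMware tool: it assumes the output has been saved as text in the layout shown above, that "State: 1" means active (as annotated in the output), and that the caller already knows which VTEP IPs belong to the powered-off Edge Node. The function names parse_vtep_group and stale_active_vteps are hypothetical, and treating 192.168.1.152 as the offline Edge Node is an assumption made purely for illustration.

```python
import re

# Matches "Key: value" lines; keys contain only letters, spaces, parentheses
# and slashes, so the leading timestamp line is skipped.
FIELD = re.compile(r"^([A-Za-z][A-Za-z ()/]*):\s*(.*)$")

SAMPLE = """\
Wed May 10 2023 UTC 19:12:37.973
VTEP Group Label: 45057
Type: Gateway
HA Type: Active/Standby
Activeness Proto: Activeness Notification
HA State Sync (ms): 32312
Active Mbr: 1
Label: 46081
VTEP IP: 192.168.1.151
VTEP MAC: 0a:00:08:2f:55:a6
State: 1 <==== Active
BFD Count: 0
Label: 109569
VTEP IP: 192.168.1.152
VTEP MAC: 0a:00:08:37:10:ae
State: 1 <==== Active
BFD Count: 0
"""


def parse_vtep_group(text):
    """Split the output into group-level fields and a list of members."""
    group, members = {}, []
    for raw in text.splitlines():
        # Drop trailing annotations such as "<==== Active".
        line = raw.split("<", 1)[0].strip()
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.group(1).strip(), match.group(2).strip()
        if key == "Label":
            # A bare "Label" line starts a new VTEP member.
            members.append({"Label": value})
        elif members:
            members[-1][key] = value
        else:
            group[key] = value
    group["members"] = members
    return group


def stale_active_vteps(group, offline_ips):
    """Return VTEP IPs still in State 1 although their Edge Node is offline."""
    return [
        m["VTEP IP"]
        for m in group["members"]
        if m.get("State") == "1" and m.get("VTEP IP") in offline_ips
    ]


if __name__ == "__main__":
    parsed = parse_vtep_group(SAMPLE)
    # Assumption for illustration only: 192.168.1.152 is the powered-off edge.
    print(stale_active_vteps(parsed, {"192.168.1.152"}))
```

A non-empty result means a VTEP of a powered-off Edge Node is still listed as active, which matches the symptom described above.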
Environment:
VMware NSX-T Data Center
VMware NSX
This issue is resolved in NSX 4.1.2 and later.
Workaround:
Powering the ungracefully powered-off NSX-T Edge Node back on restores normal traffic flow.