In an NSX-T Federated environment, an NSX-T Edge Node is powered off ungracefully, either through the vCenter UI or because of a crash.
Consequently, stretched NSX segment traffic is still sent to the offline Edge Node, because the host continues to show that node's VTEP state as active instead of removing it. As a result, traffic disruption occurs on the NSX-T Edge Nodes.
To troubleshoot this issue, run the command "get logical-switch [logical switch UUID] vtep-group" as the admin user on the NSX-T Edge Node after the Edge Node has been powered off. The output resembles the following, with both VTEP members still reported as active:
Wed May 10 2023 UTC 19:12:37.973
VTEP Group Label: 45057
Type: Gateway
HA Type: Active/Standby
Activeness Proto: Activeness Notification
HA State Sync (ms): 32312
Active Mbr: 1
    Label: 46081
    VTEP IP: 192.168.1.151
    VTEP MAC: 0a:00:08:2f:55:a6
    State: 1 <==== Active
    BFD Count: 0
    Label: 109569
    VTEP IP: 192.168.1.152
    VTEP MAC: 0a:00:08:37:10:ae
    State: 1 <==== Active
    BFD Count: 0

VMware NSX-T Data Center
VMware NSX
This issue is resolved in NSX 4.1.2 or higher.
Workaround:
Powering on the NSX-T Edge Node that was ungracefully powered off restores normal traffic flow.
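
When many segments need to be checked, the "vtep-group" output can be scanned with a short script rather than read by eye. The following Python sketch is illustrative only and is not part of NSX. It assumes the output has been saved as text in the field layout shown in this article (Label, VTEP IP, VTEP MAC, State), and that "State: 1" means active, as annotated above. The function names, the SAMPLE text, and the choice of 192.168.1.152 as the powered-off edge's VTEP IP are hypothetical and used only for demonstration.

```python
import re

# One VTEP group member, in the field order shown in this article.
# Assumption: real output uses the same field names; spacing may vary.
MEMBER_RE = re.compile(
    r"Label\s*:\s*(?P<label>\d+)\s*"
    r"VTEP IP\s*:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s*"
    r"VTEP MAC\s*:\s*(?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\s*"
    r"State\s*:\s*(?P<state>\d+)"
)


def active_vtep_members(output):
    """Return the members reported with State: 1 (active per the article)."""
    members = []
    for match in MEMBER_RE.finditer(output):
        if match.group("state") == "1":
            members.append({
                "label": int(match.group("label")),
                "ip": match.group("ip"),
                "mac": match.group("mac").lower(),
            })
    return members


def stale_members(output, offline_vtep_ips):
    """Return active members whose VTEP IP belongs to a powered-off edge.

    offline_vtep_ips is supplied by the operator; the script cannot
    discover which Edge Nodes are powered off.
    """
    return [m for m in active_vtep_members(output) if m["ip"] in offline_vtep_ips]


# Sample text copied from the output in this article.
SAMPLE = """VTEP Group Label: 45057
Type: Gateway
HA Type: Active/Standby
Active Mbr: 1
Label: 46081
VTEP IP: 192.168.1.151
VTEP MAC: 0a:00:08:2f:55:a6
State: 1
BFD Count: 0
Label: 109569
VTEP IP: 192.168.1.152
VTEP MAC: 0a:00:08:37:10:ae
State: 1
BFD Count: 0
"""

if __name__ == "__main__":
    # Hypothetical: treat 192.168.1.152 as the powered-off edge's VTEP.
    for member in stale_members(SAMPLE, {"192.168.1.152"}):
        print("Still active but offline:", member["ip"], member["mac"])
```

Any member returned by stale_members matches the symptom described above: an Edge Node that is powered off while its VTEP is still listed as active.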