VMware NSX
This is expected behavior.
When two NSX Edge Nodes are configured in Active/Standby, both Edges establish and maintain BGP peering and exchange route updates with the upstream BGP peers. During the initial Fail Over, traffic may experience a brief connectivity outage (a few seconds or less) as it moves from the Preferred Edge to the Non-Preferred Edge.
During Fail Back, in order for traffic to move properly from the Non-Preferred Edge to the Preferred Edge, the Non-Preferred Edge drops its BGP peering for 30 seconds. This may cause another brief outage while connections are re-established through the Preferred Edge Node.
This can be confirmed via the NSX Edge CLI on the Non-Preferred Edge at the moment of Fail Back.
Non-Preferred (Standby) Edge is Up and BGP is established before Fail Over:
edge02(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Estab 02:10:43 NC 1418 1339 17 2
192.###.###.254 64800 Estab 02:10:43 NC 1417 1334 17 2
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv6Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Estab 02:10:43 NC 7947 7861 16 1
fd00:#:#:#::#:85fe 64800 Estab 02:10:43 NC 7944 7859 16 1
Thu Mar 27 2025 UTC 19:23:46.617
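The check above can also be done on captured output. The following is a minimal Python sketch, not an NSX tool: it assumes the CLI output has been saved as text, that each neighbor row follows the column order shown in the header line, and that Up/DownTime is in HH:MM:SS form as in this capture. The names `parse_neighbors` and `all_established` are hypothetical.

```python
import re

# Row layout assumed from the header line in the capture:
# Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
ROW = re.compile(
    r"^(?P<neighbor>\S+)\s+(?P<asn>\d+)\s+(?P<state>\S+)\s+"
    r"(?P<updown>\d+:\d{2}:\d{2})\s+(?P<bfd>\S+)\s+"
    r"(?P<in_msgs>\d+)\s+(?P<out_msgs>\d+)\s+"
    r"(?P<in_pfx>\d+)\s+(?P<out_pfx>\d+)$"
)

def parse_neighbors(text):
    """Return one dict per neighbor row; header and legend lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        hours, minutes, seconds = (int(p) for p in m["updown"].split(":"))
        rows.append({
            "neighbor": m["neighbor"],
            "state": m["state"],
            "updown_s": hours * 3600 + minutes * 60 + seconds,
            "in_pfx": int(m["in_pfx"]),
            "out_pfx": int(m["out_pfx"]),
        })
    return rows

def all_established(rows):
    """True only if at least one neighbor was parsed and all are Estab."""
    return bool(rows) and all(r["state"] == "Estab" for r in rows)

# Sample copied from the capture above (addresses masked as in the article).
SAMPLE = """\
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Estab 02:10:43 NC 1418 1339 17 2
192.###.###.254 64800 Estab 02:10:43 NC 1417 1334 17 2
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Estab 02:10:43 NC 7947 7861 16 1
fd00:#:#:#::#:85fe 64800 Estab 02:10:43 NC 7944 7859 16 1
"""

rows = parse_neighbors(SAMPLE)
print(len(rows), all_established(rows))
```

The same function applied to the Fail Back captures later in this article would report the neighbors as Idle, so `all_established` would return False for those samples.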
Confirm Non-Preferred Edge is not active:
edge02(tier0_sr[2])> get high-availability status
Thu Mar 27 2025 UTC 19:24:23.779
Service Router
UUID : 83cc####-####-####-####-#######11675
state : Standby ←Current State of Edge is Standby, waiting to take over if necessary.
type : TIER0
mode : A/S
failover mode : Preemptive
rank : 1
service count : 0
service score : 0
HA ports state
UUID : dc72####-####-####-####-#######c3ae4
op_state : Down ←This Edge is in a Down State - Failover has NOT occurred.
addresses : 169.###.###.2/24;fe80:#:#:#:#:5300/64
Peer Routers
SR UUID : fa47####-####-####-####-#######d56ac
Node UUID : f0ae####-####-####-####-#######d4639
HA state : Active ←Preferred Edge is Up and online.
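The three fields called out above (local state, HA port op_state, and peer HA state) can be pulled from a saved copy of this output. This is a hedged Python sketch that assumes the "key : value" layout shown in the capture; the function name `ha_summary` is hypothetical, and the "←" annotations used in this article are stripped if present.

```python
def ha_summary(text):
    """Extract the local SR state, HA port op_state, and peer HA state.

    Assumes 'key : value' lines as in the capture; the first occurrence
    of each key is kept.
    """
    fields = {}
    for line in text.splitlines():
        line = line.split("←", 1)[0].strip()  # drop article annotations
        if " : " not in line:
            continue  # section titles such as 'Service Router'
        key, value = line.split(" : ", 1)
        fields.setdefault(key.strip(), value.strip())
    return {
        "state": fields.get("state"),
        "op_state": fields.get("op_state"),
        "peer_ha_state": fields.get("HA state"),
    }

# Sample copied from the capture above (identifiers masked as in the article).
STANDBY_SAMPLE = """\
Service Router
UUID : 83cc####-####-####-####-#######11675
state : Standby
type : TIER0
mode : A/S
failover mode : Preemptive
rank : 1
HA ports state
UUID : dc72####-####-####-####-#######c3ae4
op_state : Down
addresses : 169.###.###.2/24;fe80:#:#:#:#:5300/64
Peer Routers
HA state : Active
"""

summary = ha_summary(STANDBY_SAMPLE)
print(summary)
```

A result of Standby / Down / Active corresponds to the pre-failover condition described here; Active / Up / Unreachable corresponds to the post-failover capture that follows.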
Fail Over has occurred, Non-Preferred Edge is now Active:
edge02(tier0_sr[2])> get high-availability status
Thu Mar 27 2025 UTC 19:26:06.061
Service Router
UUID : 83cc####-####-####-####-#######11675
state : Active ←Current State of Edge is Active, it has taken over traffic.
type : TIER0
mode : A/S
failover mode : Preemptive
rank : 1
service count : 0
service score : 0
HA ports state
UUID : dc72####-####-####-####-#######c3ae4
op_state : Up ←This Edge is in an Up State - Failover has occurred.
addresses : 169.###.###.2/24;fe80:#:#:#:#:5300/64
Peer Routers
SR UUID : fa47####-####-####-####-#######d56ac
Node UUID : f0ae####-####-####-####-#######d4639
HA state : Unreachable ←Preferred Edge is unreachable, justifying failover actions.
The Non-Preferred Edge has become Active with no change in its BGP peering: the sessions have remained established for over two hours, so any outage would have been short-lived.
edge02(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Estab 02:13:24 NC 1421 1343 17 2
192.###.###.254 64800 Estab 02:13:24 NC 1420 1338 17 2
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv6Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Estab 02:13:24 NC 7965 7878 16 1
fd00:#:#:#::#:85fe 64800 Estab 02:13:24 NC 7962 7876 16 1
Thu Mar 27 2025 UTC 19:26:27.592
When the Preferred Edge returns to service, Fail Back takes down BGP peering on the Non-Preferred Edge for 30 seconds (as designed):
Before Fail Back:
edge02(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Estab 02:18:01 NC 1428 1349 17 2
192.###.###.254 64800 Estab 02:18:01 NC 1427 1344 17 2
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv6Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Estab 02:18:01 NC 7994 7907 16 1
fd00:#:#:#::#:85fe 64800 Estab 02:18:01 NC 7991 7905 16 1
Thu Mar 27 2025 UTC 19:31:05.075
Fail Back has begun. Non-Preferred Edge BGP Peering is down and will remain down for 30 seconds:
edge02(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Idle 00:00:00 NC 1428 1351 0 0
192.###.###.254 64800 Idle 00:00:00 NC 1427 1346 0 0
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv6Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Idle 00:00:00 NC 7994 7909 0 0
fd00:#:#:#::#:85fe 64800 Idle 00:00:00 NC 7991 7907 0 0
Thu Mar 27 2025 UTC 19:31:06.068
edge02(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Idle 00:00:29 NC 1428 1351 0 0
192.###.###.254 64800 Idle 00:00:29 NC 1427 1346 0 0
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv6Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Idle 00:00:29 NC 7994 7909 0 0
fd00:#:#:#::#:85fe 64800 Idle 00:00:29 NC 7991 7907 0 0
Thu Mar 27 2025 UTC 19:31:35.179
After this 30-second period, traffic has already failed back to the Preferred Edge with minimal to no outage, and BGP is re-established on the Non-Preferred Edge Node:
edge02(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
192.###.###.254 64800 Estab 00:00:02 NC 1430 1353 0 0
192.###.###.254 64800 Estab 00:00:02 NC 1429 1348 0 0
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv6Unicast
Router ID: 192.###.###.2 Local AS: 65000
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
fd00:#:#:#::#:84fe 64800 Estab 00:00:01 NC 7996 7911 0 0
fd00:#:#:#::#:85fe 64800 Estab 00:00:01 NC 7993 7909 0 0
Thu Mar 27 2025 UTC 19:31:37.072
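The 30-second window can be cross-checked from the timestamps printed with the three Fail Back samples above. The sketch below is illustrative only: it assumes the timestamp format shown in this capture and an English (C) locale for the day and month names, and `elapsed_seconds` is a hypothetical helper.

```python
from datetime import datetime

# Timestamp layout as printed by the Edge CLI in this capture.
FMT = "%a %b %d %Y UTC %H:%M:%S.%f"

def elapsed_seconds(first, second):
    """Seconds between two CLI timestamps (second minus first)."""
    t1 = datetime.strptime(first, FMT)
    t2 = datetime.strptime(second, FMT)
    return (t2 - t1).total_seconds()

# Timestamps copied from the three samples above.
first_idle = "Thu Mar 27 2025 UTC 19:31:06.068"   # Idle, Up/DownTime 00:00:00
still_idle = "Thu Mar 27 2025 UTC 19:31:35.179"   # Idle, Up/DownTime 00:00:29
established = "Thu Mar 27 2025 UTC 19:31:37.072"  # Estab, Up/DownTime 00:00:02

idle_gap = elapsed_seconds(first_idle, still_idle)
reestab_gap = elapsed_seconds(first_idle, established)
print(round(idle_gap, 3), round(reestab_gap, 3))
```

In this capture the neighbors were still Idle roughly 29 seconds after the first Idle sample and were Established again roughly 31 seconds after it, which is consistent with the 30-second hold described in this article.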
The Non-Preferred Edge has returned to its original High Availability state:
edge02(tier0_sr[2])> get high-availability status
Thu Mar 27 2025 UTC 20:07:20.064
Service Router
UUID : 83cc####-####-####-####-#######11675
state : Standby ←Non-Preferred Edge has returned to Standby state, waiting to take over if necessary.
type : TIER0
mode : A/S
failover mode : Preemptive
rank : 1
service count : 0
service score : 0
HA ports state
UUID : dc72####-####-####-####-#######c3ae4
op_state : Down
addresses : 169.###.###.2/24;fe80:#:#:#:#:5300/64
Peer Routers
SR UUID : fa47####-####-####-####-#######d56ac
Node UUID : f0ae####-####-####-####-#######d4639
HA state : Active ←Preferred Edge is once again Up and online.