A BGP neighbor adjacency flaps or goes down unexpectedly on a specific NSX Edge node within a Tier-0 Gateway.
NSX Alarms: "BGP Neighbor Down" alerts are generated for a specific Edge Transport Node.
In a multi-node Tier-0 cluster, the alarm may be isolated to a single Edge node while others remain stable.
The session moves from Established to Clearing after the Edge receives a Hold Timer Expired NOTIFICATION message from the neighbor.
Log lines similar to the following appear in the Edge frr.log:
2026/04/29 21:15:06.940568 BGP: 10.X.X.X [Event] BGP connection closed fd 32
2026/04/29 21:15:06.940701 BGP: 10.X.X.X BGP connection fd 32, recieved fatal error, wipe off ringbuf
2026/04/29 21:15:06.940778 BGP: %NOTIFICATION: received from neighbor 10.X.X.X 4/0 (Hold Timer Expired) 0 bytes
2026/04/29 21:15:06.940812 BGP: 10.X.X.X [FSM] Receive_NOTIFICATION_message (Established->Clearing), fd 32
2026/04/29 21:15:06.940833 BGP: %ADJCHANGE: neighbor 10.X.X.X(Unknown) in vrf default Down BGP Notification received
2026/04/29 21:15:06.941369 BGP: 10.X.X.X: peer keepalive being removed, acquiring lock
2026/04/29 21:15:06.941390 BGP: 10.X.X.X: peer keepalive removed
2026/04/29 21:15:06.941501 BGP: 10.X.X.X(0x1d51d676bff0): close file descriptor
2026/04/29 21:15:06.941515 BGP: bgp_fsm_change_status : vrf default(0), Status: Clearing established_peers 0
2026/04/29 21:15:06.942867 BGP: 10.X.X.X (0x1d51d676bff0 -1) went from Established to Clearing
2026/04/29 21:15:06.942878 BGP: Peer 10.X.X.X fd -1 send BGP_DOWN message to BGP adapter
2026/04/29 21:15:06.942981 BGP: BGP Adapter: peer 10.X.X.X (vrf: default) send update, old UP, new DOWN state
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
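To locate these events in a large frr.log, the NOTIFICATION lines can be filtered programmatically. The following is a minimal sketch, assuming the log lines follow the same layout as the excerpt above; the regular expression and the function name find_received_notifications are illustrative and not part of any NSX tooling. Error code 4 with subcode 0 corresponds to Hold Timer Expired, as shown in the excerpt.

```python
import re

# Pattern modeled on the "%NOTIFICATION: received from neighbor" line in the
# excerpt above. The exact layout may differ between NSX/FRR versions.
NOTIFICATION_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) BGP: "
    r"%NOTIFICATION: received from neighbor (?P<peer>\S+) "
    r"(?P<code>\d+)/(?P<subcode>\d+) \((?P<reason>[^)]*)\)"
)


def find_received_notifications(lines):
    """Return (timestamp, peer, code, subcode, reason) per received NOTIFICATION."""
    hits = []
    for line in lines:
        m = NOTIFICATION_RE.match(line.strip())
        if m:
            hits.append(
                (m["ts"], m["peer"], int(m["code"]), int(m["subcode"]), m["reason"])
            )
    return hits


# Two lines copied from the excerpt above, used as sample input.
sample = [
    "2026/04/29 21:15:06.940568 BGP: 10.X.X.X [Event] BGP connection closed fd 32",
    "2026/04/29 21:15:06.940778 BGP: %NOTIFICATION: received from neighbor "
    "10.X.X.X 4/0 (Hold Timer Expired) 0 bytes",
]

for hit in find_received_notifications(sample):
    print(hit)
```

In practice the sample list would be replaced by the lines read from the frr.log file in the Edge log bundle.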
Evidence from the Edge logs (var/log/debug/routing/bgp-flap****), which are available from NSX version 4.2.x, confirms that the NSX Edge node was successfully generating and transmitting Keepalives every 30 seconds leading up to the flap:
Down at: 2026/04/29 21:15:06
Session Reset Due To: BGP Notification received
Down events:
2026/04/29 21:15:06: Receive_NOTIFICATION_message [Established -> Clearing]
2026/04/29 21:13:25: ConnectRetry_timer_expired [OpenConfirm -> Established]
2026/04/29 21:11:46: BGP_Start [Idle -> Connect]
2026/04/29 21:11:45: Clearing_Completed [Clearing -> Idle]
2026/04/29 21:12:28: Receive_OPEN_message [OpenSent -> OpenConfirm]
2026/04/29 21:12:28: TCP_connection_open [Active -> OpenSent]
2026/04/29 21:12:28: New_bgp_connection [Idle -> Active]
Keepalives sent before flap:
2026/04/29 21:14:55
2026/04/29 21:14:25
2026/04/29 21:13:55
2026/04/29 21:13:25
2026/04/29 21:12:55
(Keepalives being sent every 30 seconds)
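The 30-second cadence can be checked directly from the timestamps. The following is a minimal sketch, assuming the timestamps are copied from the "Keepalives sent before flap" list and the "Down at" field above; the helper name keepalive_gaps is illustrative only.

```python
from datetime import datetime

FMT = "%Y/%m/%d %H:%M:%S"


def keepalive_gaps(timestamps):
    """Return the gaps in seconds between consecutive Keepalive timestamps."""
    times = sorted(datetime.strptime(t, FMT) for t in timestamps)
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


# Values copied from the bgp-flap output above.
keepalives = [
    "2026/04/29 21:14:55",
    "2026/04/29 21:14:25",
    "2026/04/29 21:13:55",
    "2026/04/29 21:13:25",
    "2026/04/29 21:12:55",
]
down_at = "2026/04/29 21:15:06"

gaps = keepalive_gaps(keepalives)
last_sent = max(datetime.strptime(t, FMT) for t in keepalives)
since_last = (datetime.strptime(down_at, FMT) - last_sent).total_seconds()

print(gaps)        # [30.0, 30.0, 30.0, 30.0]
print(since_last)  # 11.0
```

Uniform 30-second gaps, with the session going down only 11 seconds after the last Keepalive was sent, are consistent with the Edge sending Keepalives on schedule up to the flap.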
VMware NSX
The BGP session is torn down because the external neighbor's hold timer expired.
This indicates that the neighbor did not receive or process BGP Keepalive packets from the NSX Edge within the configured hold-down interval.
Evidence from the Edge logs confirms that the NSX Edge node was successfully generating and transmitting Keepalives every 30 seconds leading up to the flap.
Despite these packets being sent by the Edge, the neighbor reports a timeout.
This indicates that the packets were either dropped in the physical underlay or that the neighbor was unable to process them in time.
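The reasoning above can be illustrated with a short calculation. The following is a sketch only: the negotiated hold time is not shown in the logs in this article, so the 90-second value is an assumption (three times the observed 30-second Keepalive interval) and must be replaced with the value negotiated in your environment. The helper name unprocessed_keepalives is hypothetical, and the calculation ignores transit and processing delay.

```python
from datetime import datetime, timedelta

FMT = "%Y/%m/%d %H:%M:%S"

# ASSUMPTION: the hold time is not stated in this article. 90 seconds is used
# here only as an example (3 x the 30-second Keepalive interval).
ASSUMED_HOLD_TIME_S = 90


def unprocessed_keepalives(sent, notification_at, hold_time_s):
    """Return Keepalives sent after the neighbor's hold timer was last reset.

    A hold timer that expires at notification_at was last restarted
    hold_time_s seconds earlier, so any Keepalive sent after that point
    was not processed by the neighbor before the timer expired.
    """
    last_reset = datetime.strptime(notification_at, FMT) - timedelta(seconds=hold_time_s)
    return sorted(t for t in sent if datetime.strptime(t, FMT) > last_reset)


# Values copied from the bgp-flap output earlier in this article.
keepalives_sent = [
    "2026/04/29 21:14:55",
    "2026/04/29 21:14:25",
    "2026/04/29 21:13:55",
    "2026/04/29 21:13:25",
    "2026/04/29 21:12:55",
]
notification_at = "2026/04/29 21:15:06"

missed = unprocessed_keepalives(keepalives_sent, notification_at, ASSUMED_HOLD_TIME_S)
print(missed)
# ['2026/04/29 21:13:55', '2026/04/29 21:14:25', '2026/04/29 21:14:55']
```

Under this assumed hold time, three consecutive Keepalives were sent by the Edge without resetting the neighbor's timer, which is the pattern to look for when reviewing the underlay and the neighbor device.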
If you believe you have encountered this issue, please engage your physical network team to review the underlay infrastructure for potential packet drops or resource constraints on the physical neighbor.
If you are contacting Broadcom support about this issue, please provide the following:
Handling Log Bundles for offline review with Broadcom support: