BGP neighbors fail to establish due to IPv4/IPv6 addresses missing on NSX-T edge node interfaces

Article ID: 322523

Products

VMware NSX

Issue/Introduction

If an NSX edge node has been recently rebooted:

  • IPv4/IPv6 addresses are missing on edge node interfaces.
  • IPv4/IPv6 BGP neighbor relationships will not get established.
  • The edge node log /var/log/frr/frr.log contains entries similar to the following:

    BGP: [EC 33554465] <IPv4-address-1> [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Start, BGP_Start, fd -1
    BGP: [Event] Incoming BGP connection rejected from <IPv4-address-1> since it is not directly connected and TTL is 1

  • The edge node log /var/log/syslog contains entries similar to the following:

    nsx-edge-1 bgpd 22005 - - [EC 33554465] <IPv6-address-1> [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Start, BGP_Start, fd -1
    nsx-edge-1 bgpd 22005 - - [EC 33554465] <IPv6-address-2> [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Start, BGP_Start, fd -1

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
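
To check quickly whether a log contains the signatures shown above, a short script can count matching lines per peer. The following is a minimal sketch, not a VMware-provided tool: the function name count_bgp_start_failures is hypothetical, the patterns are taken only from the example lines in this article, and the sample addresses are documentation placeholders. Log formats may differ between NSX versions.

```python
import re
from collections import Counter

# Pattern for the FSM failure line quoted in this article.
# The peer address is the token immediately before "[FSM]".
FSM_FAILURE = re.compile(
    r"\[EC 33554465\]\s+(?P<peer>\S+)\s+\[FSM\] "
    r"Failure handling event BGP_Start in state Idle"
)

# Pattern for the rejected-connection line quoted in this article.
REJECTED = re.compile(
    r"Incoming BGP connection rejected from (?P<peer>\S+) "
    r"since it is not directly connected"
)


def count_bgp_start_failures(lines):
    """Return two Counters keyed by peer address: (fsm_failures, rejected)."""
    fsm, rejected = Counter(), Counter()
    for line in lines:
        match = FSM_FAILURE.search(line)
        if match:
            fsm[match.group("peer")] += 1
            continue
        match = REJECTED.search(line)
        if match:
            rejected[match.group("peer")] += 1
    return fsm, rejected


# Placeholder sample lines modelled on the excerpts above (addresses are
# documentation ranges, not taken from a real environment).
sample = [
    "BGP: [EC 33554465] 192.0.2.1 [FSM] Failure handling event BGP_Start "
    "in state Idle, prior events BGP_Start, BGP_Start, fd -1",
    "BGP: [Event] Incoming BGP connection rejected from 192.0.2.1 "
    "since it is not directly connected and TTL is 1",
    "nsx-edge-1 bgpd 22005 - - [EC 33554465] 2001:db8::1 [FSM] Failure "
    "handling event BGP_Start in state Idle, prior events BGP_Start, "
    "BGP_Start, fd -1",
]

fsm_failures, rejected_peers = count_bgp_start_failures(sample)
print("FSM start failures:", dict(fsm_failures))
print("Rejected connections:", dict(rejected_peers))
```

To scan a real log, pass an open file object instead of the sample list, for example a copy of /var/log/frr/frr.log retrieved from the edge node.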

Environment

VMware NSX

Cause

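The routing controller learns about interface changes by reading kernel notifications from a netlink socket.

During an edge reboot, some of these netlink notifications might be dropped or missed. When that happens, the interface configuration held by the routing controller no longer matches the kernel's, and addresses are missing on the affected interfaces.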
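
The failure mode can be illustrated with a small model. This is a conceptual sketch only, not NSX's implementation: the function names, the interface name uplink-1, and the addresses are all hypothetical. It shows how a consumer that builds its state purely from notifications ends up with a gap when one is dropped, and how a full rescan closes that gap.

```python
def apply_notifications(notifications, dropped=()):
    """Build an {interface: set(addresses)} view from (ifname, addr) events.

    Events whose index appears in `dropped` are skipped, modelling
    notifications that were lost before the consumer read them.
    """
    view = {}
    for index, (ifname, addr) in enumerate(notifications):
        if index in dropped:
            continue
        view.setdefault(ifname, set()).add(addr)
    return view


def missing_addresses(kernel_view, controller_view):
    """Return addresses present in the kernel view but absent from the controller view."""
    missing = {}
    for ifname, kernel_addrs in kernel_view.items():
        gap = set(kernel_addrs) - set(controller_view.get(ifname, ()))
        if gap:
            missing[ifname] = gap
    return missing


# Hypothetical address events generated while the edge boots.
events = [
    ("uplink-1", "192.0.2.2/24"),
    ("uplink-1", "2001:db8::2/64"),
]

kernel_view = apply_notifications(events)                   # the kernel has everything
controller_view = apply_notifications(events, dropped={1})  # one event was lost

print("Missing before rescan:", missing_addresses(kernel_view, controller_view))

# A rescan re-reads the full state instead of relying on past events.
controller_view = {name: set(addrs) for name, addrs in kernel_view.items()}
print("Missing after rescan:", missing_addresses(kernel_view, controller_view))
```

In this model the lost event leaves the IPv6 address unknown to the controller until the full state is read again, which mirrors why rescanning the interfaces recovers the missing addresses.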

Resolution

This is a known issue impacting VMware NSX.
In a future release of VMware NSX, the system will automatically trigger the interface rescan twice as part of its remediation process.

Workaround:

To recover the missing IPv4/IPv6 addresses, rescan the interfaces by running the following commands in the edge node CLI as the admin user:

Edge> get logical-routers
Edge> vrf <vrf_id of SERVICE_ROUTER_TIER0>
Edge(tier0_sr)> set debug
Edge(tier0_sr)> start rescan interfaces
Edge(tier0_sr)> exit
 
NOTE: If VRF-Lite Tier0s are experiencing this issue, the above commands need to be run on the PARENT Tier0 VRF (the commands are not available on VRF-Lite Tier0s).
 
NOTE: For VMware NSX versions prior to 4.1.2, do not use the workaround on EVPN-enabled setups.