If an NSX edge node has been recently rebooted or upgraded:
In /var/log/frr/frr.log you see entries similar to the following:
BGP: [EC #########] <IPv4-address-1> [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Start, BGP_Start, fd -1
BGP: [Event] Incoming BGP connection rejected from <IPv4-address-1> since it is not directly connected and TTL is 1
In /var/log/syslog you see entries similar to the following:
nsx-edge-1 bgpd 22005 - - [EC #########] <IPv6-address-1> [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Start, BGP_Start, fd -1
nsx-edge-1 bgpd 22005 - - [EC #########] <IPv6-address-2> [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Start, BGP_Start, fd -1
VMware NSX
The routing controller reads kernel interface notifications through a netlink socket.
During an edge reboot, some netlink notifications can be dropped or missed, which leaves the routing controller's interface configuration incomplete compared with what the kernel holds.
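The effect of a dropped notification can be illustrated with a toy model. This is only a conceptual sketch in Python, not NSX or netlink code: the event tuples, the interface names, and the `apply_events` and `rescan` helpers are all hypothetical. It shows how one lost "add" event leaves a consumer's view short of the kernel's, and how a full rescan brings the two back in line.

```python
# Toy model only: names and data structures are hypothetical, not NSX internals.

def apply_events(view, events):
    """Apply (op, name, state) interface events to a view (dict: name -> state)."""
    for op, name, state in events:
        if op == "add":
            view[name] = state
        elif op == "del":
            view.pop(name, None)
    return view


def rescan(kernel_view):
    """Rebuild the consumer's view from a full copy of the kernel's view."""
    return dict(kernel_view)


# Events emitted by the (simulated) kernel during boot.
events = [
    ("add", "uplink-1", "up"),
    ("add", "uplink-2", "up"),
    ("add", "lo", "up"),
]

# The kernel sees every event.
kernel = apply_events({}, events)

# The consumer misses the second notification.
delivered = [event for index, event in enumerate(events) if index != 1]
controller = apply_events({}, delivered)

# Interfaces known to the kernel but absent from the consumer's view.
missing = sorted(set(kernel) - set(controller))
print("missing before rescan:", missing)

# A rescan discards the incremental state and copies the kernel's view.
controller = rescan(kernel)
print("in sync after rescan:", controller == kernel)
```

The point of the sketch is that incremental notifications cannot recover from a lost message on their own, while a rescan does not depend on any earlier message having arrived.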
This is a known issue impacting VMware NSX.
In a future release of VMware NSX, the automatic remediation process will implicitly trigger the interface rescan twice.
Workaround:
Edge> get logical-routers
Edge> vrf <vrf_id of SERVICE_ROUTER_TIER0>
Edge(tier0_sr)> set debug
Edge(tier0_sr)> start rescan interfaces
Edge(tier0_sr)> exit
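To check whether a saved copy of frr.log or syslog contains the two message types quoted in the symptoms, for example before and after running the rescan, a small script along these lines may help. It is a minimal sketch and not a VMware tool: the function names are hypothetical, the patterns are derived only from the sample entries in this article, and the peer address in the sample data is a documentation address used for illustration.

```python
import re

# Patterns derived from the sample log entries in this article.
FSM_FAILURE = re.compile(
    r"\[FSM\] Failure handling event BGP_Start in state Idle"
)
TTL_REJECT = re.compile(
    r"Incoming BGP connection rejected from \S+ "
    r"since it is not directly connected and TTL is 1"
)


def scan_lines(lines):
    """Count occurrences of each message type in an iterable of log lines."""
    counts = {"fsm_failure": 0, "ttl_reject": 0}
    for line in lines:
        if FSM_FAILURE.search(line):
            counts["fsm_failure"] += 1
        if TTL_REJECT.search(line):
            counts["ttl_reject"] += 1
    return counts


def scan_file(path):
    """Scan a log file copied off the edge node (hypothetical helper)."""
    with open(path, "r", errors="replace") as handle:
        return scan_lines(handle)


# Illustrative sample data; 192.0.2.1 is a placeholder peer address.
sample = [
    "BGP: [EC #########] 192.0.2.1 [FSM] Failure handling event BGP_Start "
    "in state Idle, prior events BGP_Start, BGP_Start, fd -1",
    "BGP: [Event] Incoming BGP connection rejected from 192.0.2.1 "
    "since it is not directly connected and TTL is 1",
    "BGP: some unrelated message",
]
result = scan_lines(sample)
print(result)
```

Non-zero counts in entries written after the reboot or upgrade suggest the edge matches the symptoms described here. If the counts stop increasing after the rescan, that is consistent with the workaround having taken effect.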