When a prefix has multiple paths from a single peer (a route server) that supports ADDPATH-TX, and that peer then withdraws one of the paths, the other paths remain in the BGP table, but the route is deleted from the RIB (zebra).
The remaining path for the prefix is still present in the BGP table but is not installed in the routing table.
Essentially, all paths are deleted from the FIB instead of only the withdrawn path, which can disrupt traffic.
The NSX Edge syslog contains entries similar to the following.
/var/log/syslog
2024-04-10T11:35:58.698Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 1 in table_id = 254 action = ADD
2024-04-10T11:37:02.773Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 2 in table_id = 254 action = ADD
2024-04-10T11:38:46.309Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 0 in table_id = 254 action = DELETE
n_nexthops = 0 -> this is not valid, because the route server is still advertising the prefix with one next hop
2024-04-10T11:38:55.872Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 1 in table_id = 254 action = ADD
2024-04-10T11:38:57.618Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 2 in table_id = 254 action = ADD
2024-04-10T11:39:42.612Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 0 in table_id = 254 action = DELETE
2024-04-10T11:41:39.624Z nsxedge.##.local NSX 7 ROUTING [nsx@6876 comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-nsxa" level="INFO"] Received prefix 10.##.##.208/32 with n_nexthops = 1 in table_id = 254 action = ADD
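To check a longer syslog for the same pattern, a small script can flag each DELETE with n_nexthops = 0 that directly follows an ADD with two or more next hops for the same prefix. The following is a minimal sketch and not a VMware tool: the function name `find_suspect_deletes` is hypothetical, and the regular expression assumes the rcpm message format shown in the entries above.

```python
import re

# Assumes the rcpm message format shown in the syslog entries above.
PATTERN = re.compile(
    r"^(?P<ts>\S+)\s.*Received prefix (?P<prefix>\S+) "
    r"with n_nexthops = (?P<n>\d+) in table_id = (?P<table>\d+) "
    r"action = (?P<action>ADD|DELETE)"
)


def find_suspect_deletes(lines):
    """Return (timestamp, prefix) for each DELETE with n_nexthops = 0 that
    directly follows an ADD with two or more next hops for that prefix."""
    last_add = {}   # prefix -> n_nexthops from the most recent ADD
    suspects = []
    for line in lines:
        m = PATTERN.search(line.strip())
        if not m:
            continue  # not an rcpm prefix message
        prefix, n = m["prefix"], int(m["n"])
        if m["action"] == "ADD":
            last_add[prefix] = n
        else:
            # DELETE: forget the previous ADD and flag a 2+ -> 0 transition.
            previous = last_add.pop(prefix, 0)
            if n == 0 and previous >= 2:
                suspects.append((m["ts"], prefix))
    return suspects
```

The function accepts any iterable of lines, so an open file handle for /var/log/syslog can be passed directly. A flagged entry is only a candidate: a peer that legitimately withdraws all of its paths at once produces the same transition, so confirm against the BGP table before concluding that this issue is present.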
VMware NSX
A BGP path-computation optimization was implemented in the 4.1 release. As part of that change, a shortcut was added for the case where a path for a prefix is being removed and the prefix has no other paths available.
In that case, best-path computation for the prefix was skipped and the route was removed from the RIB. The optimization did not consider that a peer supporting ADDPATH-TX can advertise several paths for the same prefix. In that scenario, a best-path computation still needs to be scheduled so that only the corresponding next hop is removed and the entire route is not deleted.
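The effect of the shortcut can be illustrated with a toy model. This is a sketch under stated assumptions and not the NSX or zebra implementation: the `Path` and `ToyRib` classes, the `count_same_peer` flag, and the example addresses are all hypothetical. It also assumes that the faulty check ignored remaining paths from the withdrawing peer, which is one reading of the description above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Path:
    peer: str      # advertising peer (hypothetical label)
    path_id: int   # ADD-PATH path identifier
    nexthop: str


@dataclass
class ToyRib:
    bgp_table: dict = field(default_factory=dict)  # prefix -> set of Path
    rib: dict = field(default_factory=dict)        # prefix -> set of next hops

    def _best_path(self, prefix):
        # Toy best-path run: install every remaining next hop as multipath.
        paths = self.bgp_table.get(prefix, set())
        if paths:
            self.rib[prefix] = {p.nexthop for p in paths}
        else:
            self.rib.pop(prefix, None)

    def add(self, prefix, path):
        self.bgp_table.setdefault(prefix, set()).add(path)
        self._best_path(prefix)

    def withdraw(self, prefix, path, count_same_peer):
        paths = self.bgp_table.get(prefix, set())
        paths.discard(path)
        if count_same_peer:
            remaining = paths
        else:
            # Assumed faulty check: paths from the same peer are overlooked.
            remaining = {p for p in paths if p.peer != path.peer}
        if not remaining:
            # Shortcut: skip best-path computation and delete the route.
            self.rib.pop(prefix, None)
        else:
            self._best_path(prefix)
```

With two paths from one peer, withdrawing one path with `count_same_peer=False` leaves the other path in `bgp_table` while the prefix disappears from `rib`, which mirrors the reported symptom. With `count_same_peer=True`, best-path computation runs and only the withdrawn next hop is removed.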
The issue is resolved in NSX 4.2 and later releases.
Workaround
Reset BGP to repopulate the routes.
Command: from the T0_SR VRF, execute
nsx-t-edge(tier0_sr)> clear bgp <ip-address>