VMs become unreachable from outside as NSX edge stops advertising BGP routes after manager upgrade from NSX-T 3.1.x to 3.2.x


Article ID: 386706



Products

VMware NSX-T Data Center

Issue/Introduction

- The NSX edge stops advertising BGP routes after the manager is upgraded from NSX-T 3.1.x to 3.2.x.
- VMs become unreachable from outside NSX because the upstream router no longer has the required routes.
- The BGP neighbor output shows 0 in the OutPfx column.

Neighbor                           AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx
10.##.##.#42                       648##       Estab 08w3d03h     UP  5079823 5108838 1      0
10.##.##.#58                       648##       Estab 08w3d03h     UP  5079948 5108838 1      0

- No route map is configured under 'Set Route Re-distribution'. Disabling and then re-enabling the Route Re-distribution option in the NSX Manager UI resolves the issue.
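
The OutPfx symptom can be checked across many neighbors with a short script. The following is a minimal sketch, assuming the neighbor summary has been saved as text with the nine columns shown above. The function name and the sample addresses (documentation-range IPs) are hypothetical and not part of NSX.

```python
def neighbors_not_advertising(summary_text):
    """Return neighbors that are Established but advertise no prefixes (OutPfx 0)."""
    flagged = []
    for line in summary_text.splitlines():
        fields = line.split()
        # Expected columns:
        # Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
        if len(fields) != 9 or fields[0] == "Neighbor":
            continue  # skip the header, blank lines, and anything unexpected
        neighbor, state, out_pfx = fields[0], fields[2], fields[8]
        if state == "Estab" and out_pfx == "0":
            flagged.append(neighbor)
    return flagged


# Hypothetical sample data in the same layout as the output above.
SAMPLE = """\
Neighbor     AS     State  Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx
192.0.2.1    65001  Estab  08w3d03h     UP  5079823 5108838 1      0
192.0.2.2    65001  Estab  08w3d03h     UP  5079948 5108838 1      12
192.0.2.3    65001  Active never        DN  0       0       0      0
"""

if __name__ == "__main__":
    print(neighbors_not_advertising(SAMPLE))
```

A neighbor that is not in the Estab state is skipped, because a zero OutPfx there points to a session problem and not to the redistribution issue described in this article.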

Environment

VMware NSX-T Data Center

Cause


- The issue can occur if a route map that was referenced by route redistribution was deleted. Logs similar to the following may be present:
Entity <ROUTE-MAP-NAME>(RouteMap/c629####-####-####-####-####9a017a0) can not be deleted as it is being referenced by entity(s): 2c86b####-####-####-####-####c3be-rt-redistribution-rule-list(RedistributionRuleList/e940####-####-####-####-####95b9)","source_reference":"/infra/realized-state/enforcement-points/default/tier-0-logical-routers/2c86b####-####-####-####-####c3be/route-maps/<ROUTE-MAP-NAME>"

- The route map is not deleted in the management plane (MP) because it is still referenced by a redistribution list or a BGP neighbor, so it remains in the system.
- During the upgrade, the manager restarts and the provider is triggered to attempt deletion of the route map again. This time the route map is deleted because the forward relationship is missing.
- Because the route map is now missing, the edge rejects the configuration and redistribution stops.
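
To find affected route maps in collected logs, the blocked-deletion message above can be matched with a regular expression. This is a rough sketch based only on the single log line quoted in this article. The function name is hypothetical, and the handling of several referencing entities on one line is an assumption, since the article shows only one.

```python
import re

# Matches the "can not be deleted" message quoted in the Cause section.
_BLOCKED = re.compile(
    r"Entity (?P<name>.+?)\(RouteMap/(?P<rm_id>[^)]+)\) can not be deleted "
    r'as it is being referenced by entity\(s\): (?P<refs>[^"]*)'
)
# One referencing entity, written as name(Type/id). Assumption: when more
# than one entity is listed, each one uses this same shape.
_REF = re.compile(r"(?P<name>[^\s,(]+)\((?P<type>\w+)/(?P<id>[^)]+)\)")


def parse_blocked_delete(line):
    """Return the route map name and its referrers, or None if no match."""
    match = _BLOCKED.search(line)
    if match is None:
        return None
    referrers = [(ref["type"], ref["name"]) for ref in _REF.finditer(match["refs"])]
    return {"route_map": match["name"], "referenced_by": referrers}


# Hypothetical line with placeholder names and ids, same shape as the log above.
EXAMPLE = (
    "Entity rm-demo(RouteMap/1111) can not be deleted as it is being "
    "referenced by entity(s): t0-demo-rt-redistribution-rule-list"
    '(RedistributionRuleList/2222)","source_reference":"/infra/example"'
)

if __name__ == "__main__":
    print(parse_blocked_delete(EXAMPLE))
```

A referrer of type RedistributionRuleList indicates that the route map was still attached to route redistribution when the deletion was attempted.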

Resolution

Disabling and re-enabling route redistribution should start redistribution again.

To avoid this issue, use the following steps before upgrading to NSX-T 3.2.x to check whether any route maps are in a pending-deletion state.

1. Check the NSX Manager UI for any stale route map in the configuration and for any related alarms.
2. Check for route maps marked for deletion using the following API. Run it for each Tier-0 separately and look for '"markedForDelete": true'.
GET https://<managerIP>/policy/api/v1/infra/tier-0s/<tier-0>/route-maps?include_mark_for_delete_objects=true

3. Check whether any route map is returned with markedForDelete:true. If none is, there should be no stale route map.
4. If any route map has markedForDelete set to true and is stuck in the UI, check where it is used (route filter, redistribution list, etc.).
5. Wherever it is used, either remove the route-map reference or replace it with a new route map, depending on filtering requirements.
6. Wait 5 minutes (the entry can take up to 5 minutes to clear), then check for route maps marked for deletion again. Run this for each Tier-0 separately.
GET https://<managerIP>/policy/api/v1/infra/tier-0s/<tier-0>/route-maps?include_mark_for_delete_objects=true

If a route map still shows as pending deletion or in the markedForDelete:true state, open a support case with Broadcom.
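
The per-Tier-0 check can be scripted. The following is a minimal sketch using only the Python standard library. The URL is the one given in the steps. Everything else is an assumption made for illustration: the function names, HTTP basic authentication, a response body with a "results" list whose items carry an "id", and accepting the snake_case spelling marked_for_delete alongside the markedForDelete flag named in this article.

```python
import base64
import json
import urllib.request


def route_maps_url(manager, tier0_id):
    """Build the route-map query URL from the steps for one Tier-0."""
    return (
        f"https://{manager}/policy/api/v1/infra/tier-0s/{tier0_id}"
        "/route-maps?include_mark_for_delete_objects=true"
    )


def pending_delete(response):
    """Return ids of route maps flagged for deletion in a parsed response."""
    flagged = []
    for item in response.get("results", []):  # assumption: list under "results"
        # Assumption: accept either spelling of the flag.
        if item.get("markedForDelete") is True or item.get("marked_for_delete") is True:
            flagged.append(item.get("id"))
    return flagged


def fetch_route_maps(manager, tier0_id, user, password, context=None):
    """GET the route maps of one Tier-0 (assumes basic authentication)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        route_maps_url(manager, tier0_id),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, context=context) as resp:
        return json.load(resp)


def check_all(tier0_ids, fetch_one):
    """Map each Tier-0 id to its pending-delete route maps, omitting clean ones."""
    report = {}
    for tier0_id in tier0_ids:
        flagged = pending_delete(fetch_one(tier0_id))
        if flagged:
            report[tier0_id] = flagged
    return report
```

A possible use is to call check_all with the list of Tier-0 ids and a small wrapper around fetch_route_maps before the upgrade, clean up any references it reports, wait 5 minutes, and call it again. An empty result on the second run suggests that no route map is still pending deletion.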

Additional Information

The following additional steps may be required if stale route maps remain after upgrading to NSX-T 3.2:
Workaround for Deleting Stuck PrefixList/Tier0/RouteMap Post-Upgrade from 3.1.x to 3.2.1+