Load balance/rebalance existing T1 gateways in the edge cluster in an NSX environment
Article ID: 393340
Products
VMware NSX
Issue/Introduction
In an NSX environment, an edge cluster contains a number of edge nodes, with T1 gateways placed on those nodes. If additional edge nodes are later added to the cluster, any new T1 gateways will be placed on the new nodes. However, existing T1 gateways are not automatically rebalanced onto the newly added nodes. To manually move a specific T1 gateway to other edge nodes, you can use the GUI or the NSX API.
Environment
VMware NSX
Resolution
To rebalance T1 gateways, toggle Auto Allocate Edges off and then back on; re-enabling it reruns the allocation algorithm.
1. Log in to the NSX Manager GUI.
2. Select Networking > Tier-1 Gateways.
3. Click the ellipsis next to the Tier-1 you want to change and select Edit.
4. Toggle Auto Allocate Edges off and select an Edge.
5. Click Save.
6. Repeat Step 3 to edit the Tier-1 again.
7. Toggle Auto Allocate Edges back on.
8. Click Save. The Tier-1 is now placed on the least-utilized Edges (up to 2).
Alternatively, you can use the NSX API to manually allocate the T1 gateway to specific edges.
Execute the state API below to retrieve the edge nodes currently allocated to the T1 gateway. The response shows the two allocated nodes with their current HA status. Check the edge_path and high_availability_status attributes.
GET https://<nsx-mgr>/policy/api/v1/infra/tier-1s/<T1-id>/state
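As a rough illustration of reading that response, the Python sketch below walks a decoded JSON document and collects every object that carries both edge_path and high_availability_status, then orders the paths active first, standby second. The helper names and the sample payload (including its wrapper keys) are hypothetical; the real state response may nest these attributes differently, which is why the search is structure-agnostic.

```python
from typing import Any, Dict, Iterator, List


def iter_edge_allocations(node: Any) -> Iterator[Dict[str, str]]:
    """Yield every JSON object that has both attributes named in the article."""
    if isinstance(node, dict):
        if "edge_path" in node and "high_availability_status" in node:
            yield {
                "edge_path": node["edge_path"],
                "high_availability_status": node["high_availability_status"],
            }
        for value in node.values():
            yield from iter_edge_allocations(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_edge_allocations(item)


def active_then_standby(state: Any) -> List[str]:
    """Return [active edge path, standby edge path].

    Raises KeyError if either status is missing from the response.
    """
    by_status = {
        entry["high_availability_status"]: entry["edge_path"]
        for entry in iter_edge_allocations(state)
    }
    return [by_status["ACTIVE"], by_status["STANDBY"]]


# Hypothetical sample: the wrapper keys are invented for illustration only.
CLUSTER = "/infra/sites/default/enforcement-points/default/edge-clusters/<Edge-Cluster-ID>"
SAMPLE_STATE = {
    "hypothetical_wrapper": {
        "hypothetical_nodes": [
            {"edge_path": CLUSTER + "/edge-nodes/0", "high_availability_status": "STANDBY"},
            {"edge_path": CLUSTER + "/edge-nodes/1", "high_availability_status": "ACTIVE"},
        ]
    }
}

if __name__ == "__main__":
    for path in active_then_standby(SAMPLE_STATE):
        print(path)
```

The returned list is already in the order that the later PUT call expects for the preferred edge paths.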
Retrieve the locale-services object of the T1 gateway. Its path is needed for the PUT calls that set or clear the preferred edge paths.
GET https://<nsx-mgr>/policy/api/v1/infra/tier-1s/<T1-id>/locale-services/<locale-services-ID>
{
  "results": [
    {
      "edge_cluster_path": "/infra/sites/default/enforcement-points/default/edge-clusters/<Edge-Cluster-ID>",
      "resource_type": "LocaleServices",
      "id": "default",
      "display_name": "default",
      "path": "/infra/tier-1s/<T1-id>/locale-services/<locale-services-ID>",
      "relative_path": "default",
      "parent_path": "/infra/tier-1s/<T1-id>",
      "remote_path": "",
      "unique_id": "########-####-####-####-########f13c",
      "realization_id": "########-####-####-####-########f13c",
      "owner_id": "########-####-####-####-########fc9f",
      "marked_for_delete": false,
      "overridden": false,
      "_system_owned": false,
      "_protection": "NOT_PROTECTED",
      "_create_time": 1757950351990,
      "_create_user": "admin",
      "_last_modified_time": 1757950351990,
      "_last_modified_user": "<user>",
      "_revision": 0
    }
  ],
  "result_count": 1,
  "sort_by": "display_name",
  "sort_ascending": true
}
Note down the "path" value above. This is the locale-services path for the Tier-1, and it is needed for the PUT calls below.
Now execute a PUT on the T1 gateway locale-services and manually pass the two preferred edge paths that you received from the first (state) API. Pass the active node path first and the standby node path second. This operation does not cause any disruption, because it assigns the same two edge nodes in the same ACTIVE and STANDBY order. The T1 is then updated to manual allocation. (Note: If the PUT fails with a principal identity error for the user, pass the header X-Allow-Overwrite: true when executing the PUT APIs.)
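The PUT described above could be prepared along the following lines. This is a minimal Python sketch using only the standard library: the function names are hypothetical, the host in the usage example is a placeholder, authentication and TLS settings are deliberately left out, and the request is only built, not sent. The field name preferred_edge_paths and the X-Allow-Overwrite header are taken from this article.

```python
import copy
import json
import urllib.request


def with_preferred_edges(locale_services: dict, active_path: str, standby_path: str) -> dict:
    """Return a copy of the locale-services body with manual edge placement.

    Order matters per the article: active node first, standby node second.
    """
    body = copy.deepcopy(locale_services)
    body["preferred_edge_paths"] = [active_path, standby_path]
    return body


def build_put_request(url: str, body: dict, allow_overwrite: bool = False) -> urllib.request.Request:
    """Build (but do not send) the PUT request for the locale-services URL."""
    headers = {"Content-Type": "application/json"}
    if allow_overwrite:
        # Only needed when the PUT is rejected with a principal identity error.
        headers["X-Allow-Overwrite"] = "true"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="PUT",
    )


if __name__ == "__main__":
    cluster = "/infra/sites/default/enforcement-points/default/edge-clusters/<Edge-Cluster-ID>"
    current = {"resource_type": "LocaleServices", "id": "default", "_revision": 0}
    body = with_preferred_edges(current, cluster + "/edge-nodes/1", cluster + "/edge-nodes/0")
    request = build_put_request(
        "https://nsx.example.com/policy/api/v1/infra/tier-1s/T1/locale-services/default",
        body,
        allow_overwrite=True,
    )
    print(request.get_method(), request.full_url)
    print(json.dumps(body, indent=2))
```

Sending the request would still require credentials and a call such as urllib.request.urlopen, which is omitted here so the sketch can be reviewed without touching a live NSX Manager.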
Now execute the same APIs again, this time clearing preferred_edge_paths from the payload. This triggers the allocation algorithm again, which looks for the least-allocated nodes and therefore selects the new edge nodes that you added to the cluster. Note: This is a disruptive operation, and you will see datapath impact.
GET https://<nsx-mgr>/policy/api/v1/infra/tier-1s/<T1-id>/locale-services/<locale-services-ID>
From the above output, remove the following:
"preferred_edge_paths": [
  "/infra/sites/default/enforcement-points/default/edge-clusters/<Edge-Cluster-ID>/edge-nodes/1",
  "/infra/sites/default/enforcement-points/default/edge-clusters/<Edge-Cluster-ID>/edge-nodes/0"
],
Then use the rest of the output as the body in the next PUT API call.
PUT https://<nsx-mgr>/policy/api/v1/infra/tier-1s/<T1-id>/locale-services/<locale-services-ID>
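Editing the payload by hand is error-prone, so the removal step can also be scripted. The Python sketch below takes the decoded GET response for the locale-services object and returns a copy without preferred_edge_paths, leaving every other field (including _revision) untouched for use as the PUT body. The function name and the sample dictionary are hypothetical, and the sketch assumes the GET returned a single locale-services object.

```python
import copy


def without_preferred_edges(locale_services: dict) -> dict:
    """Return a copy of the GET body with preferred_edge_paths removed.

    The input is left unmodified; a body without the key is returned unchanged.
    """
    body = copy.deepcopy(locale_services)
    body.pop("preferred_edge_paths", None)
    return body


# Hypothetical GET result, trimmed to a few fields for illustration.
CLUSTER_PATH = "/infra/sites/default/enforcement-points/default/edge-clusters/<Edge-Cluster-ID>"
SAMPLE_LOCALE_SERVICES = {
    "edge_cluster_path": CLUSTER_PATH,
    "preferred_edge_paths": [
        CLUSTER_PATH + "/edge-nodes/1",
        CLUSTER_PATH + "/edge-nodes/0",
    ],
    "resource_type": "LocaleServices",
    "id": "default",
    "_revision": 1,
}

if __name__ == "__main__":
    put_body = without_preferred_edges(SAMPLE_LOCALE_SERVICES)
    print(sorted(put_body))
```

Because sending this body triggers reallocation, the same caution applies as for the manual procedure: expect datapath impact when the PUT is executed.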