Standby NSX Global manager showing mode "none" instead of "standby"

Article ID: 375141

Products

VMware NSX

Issue/Introduction

  • The standby GM nodes were redeployed. After redeployment, the new cluster is healthy: get cluster status in the CLI returns all UP, and the UI shows the cluster as stable.
  • In the active GM UI, under the Location Manager tab, the newly added standby GM nodes show the status None instead of the expected Standby.
  • GET https://<Standby_GM_IP/FQDN>/global-manager/api/v1/global-infra/global-managers/<Standby_GM_Display_Name> returns NONE for the mode property:
    {
        "federation_id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "site_id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "connection_info": [
            {
                "fqdn": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
                "username": "admin",
                "thumbprint": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
            }
        ],
        "mode": "NONE",    <-- confirms the NONE mode
        "maximum_rtt": 250,
        "fail_if_rtt_exceeded": true,
        "resource_type": "GlobalManager",
        "id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "display_name": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "path": "/global-infra/global-managers/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "relative_path": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "parent_path": "/global-infra",
        "unique_id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "marked_for_delete": false,
        "overridden": false,
        "_create_user": "admin",
        "_create_time": xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
        "_last_modified_user": "system",
        "_last_modified_time": xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
        "_system_owned": false,
        "_protection": "NOT_PROTECTED",
        "_revision": 5
    }
  • In addition, the Local Manager (LM) UI and the API /api/v1/sites show that the LM configuration still points to the old standby GM nodes that were replaced.
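
The mode check described in the list above can be scripted. The following is a minimal sketch in Python, assuming the response body has the shape of the sample shown and that the expected value is spelled STANDBY. The function name gm_mode_status and the trimmed sample payload are illustrative and not part of NSX.

```python
import json


def gm_mode_status(response_body: str) -> str:
    """Classify a Global Manager by the "mode" property of its API response."""
    gm = json.loads(response_body)
    mode = gm.get("mode")
    if mode == "STANDBY":  # assumed spelling of the expected value
        return "ok: registered as standby"
    if mode == "NONE":  # the symptom described in this article
        return "affected: mode is NONE"
    return f"other: mode is {mode!r}"


# Trimmed, hypothetical sample modeled on the response shown above.
sample = '{"resource_type": "GlobalManager", "mode": "NONE", "_revision": 5}'
print(gm_mode_status(sample))
```

Feeding the function the body returned by the GET request on the standby GM gives a quick yes/no answer without reading the full JSON.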

Environment

VMware NSX

Cause

The standby Global Manager was disconnected from Site Manager. This prevents proper synchronization between the Global Managers and leads to an incomplete off-boarding process.

Resolution

Run the off-boarding API for the standby Global Manager node with the following command, as root, from the Active Global Manager CLI:


curl -X POST -ik "http://localhost:7441/api/v1/sites?action=offboard_remote" \
  -H "Content-Type: application/json" \
  -d '{"credential": {"ip": "", "port": 443, "username": "", "password": "", "thumbprint": ""}, "site_id": "#####################################"}'
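
For scripted use, the same request can be assembled programmatically. The sketch below builds an equivalent POST request in Python without sending it, so the URL and body can be inspected first. The function name build_offboard_request and the angle-bracket placeholder values are hypothetical; the endpoint, header, and field names are taken from the command above.

```python
import json
import urllib.request

# Endpoint copied from the curl command above.
OFFBOARD_URL = "http://localhost:7441/api/v1/sites?action=offboard_remote"


def build_offboard_request(site_id, ip, username, password, thumbprint, port=443):
    """Build (but do not send) the off-boarding POST request."""
    body = {
        "credential": {
            "ip": ip,
            "port": port,
            "username": username,
            "password": password,
            "thumbprint": thumbprint,
        },
        "site_id": site_id,
    }
    return urllib.request.Request(
        OFFBOARD_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values only; substitute the real values before sending.
req = build_offboard_request("<site_id>", "<ip>", "<username>", "<password>", "<thumbprint>")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen(req) would send it. As with the curl command, this should only be done on the Active Global Manager.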


To retrieve the site_id, run GET https://<Active NSX Global Manager FQDN or IP>/api/v1/sites?version=latest and select the site that needs to be off-boarded.
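
Picking the site out of the response can also be scripted. This article does not show the response schema, so the sketch below makes a loud assumption: each site appears as a JSON object carrying "id" and "name" keys. The sample payload, its "sites" key, and the function name find_site_ids are hypothetical; check them against the actual response before relying on the result.

```python
import json


def find_site_ids(payload, site_name):
    """Walk a parsed JSON document and collect the "id" of every
    object whose "name" equals site_name (assumed key names)."""
    matches = []
    stack = [payload]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            if node.get("name") == site_name and "id" in node:
                matches.append(node["id"])
            stack.extend(node.values())
        elif isinstance(node, list):
            stack.extend(node)
    return matches


# Hypothetical response shape, for illustration only.
sample = json.loads(
    '{"sites": [{"id": "1111", "name": "gm-standby-old"},'
    ' {"id": "2222", "name": "gm-active"}]}'
)
print(find_site_ids(sample, "gm-standby-old"))
```

The function returns a list rather than a single value so that an ambiguous or missing match is visible before the id is used in the off-boarding call.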

Once the off-boarding API has run, the stale standby Global Manager entry is removed from the UI automatically. Then delete the GM that is in "NONE" mode via the UI and re-add it. It then registers correctly with the "Standby" status.