NSX-T Local Manager stuck in "Initializing State" after restoring from a backup in a Federation environment

Article ID: 322487

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • You recently restored an NSX Local Manager (LM) from a backup.
  • After the restore, the Global Manager (GM) UI shows the restored LM as disconnected.
  • In the /aph_rest_output.json log file on the GM, the LM may show as disconnected for the ApplianceProxyHub service:
{
 "address": "ssl://10.0.x.x:1236",
 "conn_status": "Disconnected",
 "node_id": "11708907-48a5-4ce0-a8ca-9f452975d2c3",
 "node_type": "ApplianceProxyHub"
 }
  • The /var/log/async-replicator/ar.log file on the GM may contain similar entries indicating that the site leader is not reachable:
2023-03-28T20:43:16.714Z WARN NsxRpcStubManager-1 NsxRpcStubManager 5722 SYSTEM [nsx@6876 comp="global-manager" level="WARNING" subcomp="async-replicator"] getLeader call failed for remote site f0f6305d-8357-4929-8246-819aa7aebc08 with aph 11708907-48a5-4ce0-a8ca-9f452975d2c3.
java.lang.Exception: Unable to reach remote site f0f6305d-8357-4929-8246-819aa7aebc08aph 11708907-48a5-4ce0-a8ca-9f452975d2c3
  • The config version recorded for the LM id in the /clustering.json log file on the GM is higher than the config version recorded for the same LM id in the /clustering.json log file on the LM:
NSXT_GlobalManager1/clustering.json
"config_version": 17,
"id": "f0f6305d-8357-4929-8246-819aa7aebc08",
NSXT_GlobalManager2/clustering.json
"config_version": 17,
"id": "f0f6305d-8357-4929-8246-819aa7aebc08",
NSXT_GlobalManager3/clustering.json
"config_version": 17,
"id": "f0f6305d-8357-4929-8246-819aa7aebc08",
NSXT_Manager_LM1/clustering.json
"config_version": 16,
"id": "f0f6305d-8357-4929-8246-819aa7aebc08",
NSXT_Manager_LM2/clustering.json
"config_version": 16,
"id": "f0f6305d-8357-4929-8246-819aa7aebc08",
NSXT_Manager_LM3/clustering.json
"config_version": 16,
"id": "f0f6305d-8357-4929-8246-819aa7aebc08",
  • Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
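The config version comparison described in the symptoms can be sketched in Python. This is only an illustration: the excerpts above show just the "config_version" and "id" keys, so the surrounding structure of clustering.json (the "sites" wrapper used below) is an assumption, and the helper name find_config_versions is hypothetical. The search is recursive so that it does not depend on where those keys sit in the real file.

```python
import json


def find_config_versions(node, site_id):
    """Collect every config_version found in an object whose "id" equals site_id."""
    found = []
    if isinstance(node, dict):
        if node.get("id") == site_id and "config_version" in node:
            found.append(node["config_version"])
        for value in node.values():
            found.extend(find_config_versions(value, site_id))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_config_versions(item, site_id))
    return found


SITE_ID = "f0f6305d-8357-4929-8246-819aa7aebc08"

# Simplified stand-ins for the GM and LM clustering.json files.
# The "sites" wrapper is an assumption; only the two inner keys come from the excerpts.
gm_doc = json.loads('{"sites": [{"config_version": 17, "id": "%s"}]}' % SITE_ID)
lm_doc = json.loads('{"sites": [{"config_version": 16, "id": "%s"}]}' % SITE_ID)

gm_version = max(find_config_versions(gm_doc, SITE_ID))
lm_version = max(find_config_versions(lm_doc, SITE_ID))

# The symptom in this article applies when the LM value is lower than the GM value.
lm_is_behind = lm_version < gm_version
```

With real files, replace the json.loads calls with json.load on the clustering.json taken from each GM and LM node.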


Environment

VMware NSX-T Data Center

Cause

The Local Manager is stuck in "Initializing State" after a restore from backup because its config version is lower than the config version recorded for it on the Global Manager.

Resolution

This is a known issue impacting NSX-T Data Center.

Workaround:
Run the following API call on the impacted Local Manager repeatedly until the config version shown on the Local Manager is one greater than the config version shown on the Global Manager:
POST https://<local-manager>/api/v1/sites?action=refresh
 
In the example above, the LM config version (16) is one lower than the GM config version (17), so the API call must be run twice to raise the LM config version from 16 to 18, one greater than the GM.

NSXT_Manager_LM1/clustering.json
"config_version": 16,
NSXT_GlobalManager1/clustering.json
"config_version": 17,
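
The arithmetic behind the workaround can be sketched in Python as follows. This is a minimal sketch, not a supported tool. It assumes that each refresh call raises the LM config version by exactly one (as the 16 to 18 example implies) and that the NSX API accepts HTTP basic authentication in your environment. The function names, host name, and credentials are hypothetical placeholders; only the URL path and the POST method come from this article.

```python
import base64
import urllib.request


def refreshes_needed(lm_version, gm_version):
    """Number of refresh calls needed to reach gm_version + 1, assuming +1 per call."""
    return max(0, gm_version + 1 - lm_version)


def build_refresh_request(lm_host, username, password):
    """Build the POST request for the refresh action named in the workaround."""
    url = f"https://{lm_host}/api/v1/sites?action=refresh"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(url, data=b"", method="POST")
    # Basic authentication is an assumption; use whatever your deployment requires.
    request.add_header("Authorization", f"Basic {token}")
    return request


def run_refreshes(lm_host, username, password, count):
    """Send the refresh call `count` times, one after another."""
    for _ in range(count):
        request = build_refresh_request(lm_host, username, password)
        with urllib.request.urlopen(request) as response:
            response.read()


# Values from the example above: LM at 16, GM at 17, so two calls are needed.
calls = refreshes_needed(lm_version=16, gm_version=17)
```

Calling run_refreshes("<local-manager>", "<user>", "<password>", calls) would then issue the requests. After running them, confirm the resulting config version on the Local Manager rather than relying on the computed count alone.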