2022-12-15T22:43:37.236Z hostname NSX 3142 MONITORING [nsx@6876 alarmId="11cb4447-7e32-41a6-980d-eb1dac7039fc" alarmState="OPEN" comp="global-manager" entId="535f2ecd-a292-44b9-8ade-f3f4666336d4" errorCode="MP701099" eventFeatureName="federation" eventSev="CRITICAL" eventState="On" eventType="gm_to_gm_split_brain" level="FATAL" nodeId="19a10f42-20e0-836d-e360-71f6fa6b1838" subcomp="monitoring"] Multiple Global Manager nodes are active: 425f2ecd-a292-44b9-8ade-f3f4666336d4,9e0d4226-8612-4d80-894f-7b80a3e3935d. Only one Global Manager node must be active at any time.
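To see which Global Manager nodes the alarm reports as active, the node IDs can be pulled out of the message text. The following is a minimal sketch, not a supported tool: the function name `active_gm_ids` is hypothetical, and it assumes the alarm message keeps the exact wording shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads syslog lines on stdin and prints, one per line,
# the node IDs listed in the gm_to_gm_split_brain alarm message.
# Assumes the message wording matches the sample line in this article.
active_gm_ids() {
  sed -n 's/.*Multiple Global Manager nodes are active: \([^ ]*\)\. Only one.*/\1/p' \
    | tr ',' '\n'
}
```

For example, `active_gm_ids < exported-syslog.txt` (the file name is a placeholder) prints each reported node ID on its own line, which can then be compared against the nodes of each site.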
Environment
VMware NSX-T Data Center 3.x, VMware NSX 4.0.0.1
Cause
A split-brain condition occurs when two Global Managers both believe they are active and have the same epoch. In this case, it is caused by a race condition in the handling of site configuration updates.
Resolution
This issue is resolved in NSX 3.2.3, available from the VMware Customer Connect portal.
Workaround: the steps below refer to the two Global Managers as GM Site 1 (site1) and GM Site 2 (site2).
First determine the current state on both GMs.
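One way to check the state is to list the Global Manager resources on each GM and compare the reported modes. The sketch below makes several assumptions: that the collection path `/global-manager/api/v1/global-infra/global-managers` (the parent of the resource path used in step 1) can be read with GET, that each entry carries `id` and `mode` fields, and that `admin` is a valid API user. The function names are hypothetical, and the grep-based JSON parsing is deliberately crude.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every top-level-looking "id" and "mode" string
# field from a JSON document on stdin as key=value lines, in document order.
# Crude text matching, not a real JSON parser.
gm_fields() {
  grep -oE '"(id|mode)"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -E 's/"([^"]*)"[[:space:]]*:[[:space:]]*"([^"]*)"/\1=\2/'
}

# Hypothetical helper: query one GM and show the id/mode pairs it reports.
# -k skips certificate verification; curl prompts for the password.
gm_state() {
  local host="$1" user="${2:-admin}"
  curl -sk -u "$user" \
    "https://${host}/global-manager/api/v1/global-infra/global-managers" \
    | gm_fields
}
```

Running `gm_state site1` and `gm_state site2` (host names as used in this article) and comparing the `mode=` lines shows whether both GMs report ACTIVE.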
In this example, site1 has been verified as the GM that should be ACTIVE, and the following procedure is used to reset the state of site2:
1) On the site2 GM, remove the extra site1 resource, which serves no purpose on site2: DELETE https://site2/global-manager/api/v1/global-infra/global-managers/site1
2) Change site2 from ACTIVE to STANDBY using the internal API. SSH to the site2 GM as the root user; this API is internal and must be run directly on the GM. Do NOT change any field name, as the request is intentionally sent exactly in this form:
curl -X POST -ik http://localhost:7441/api/v1/sites?action=set_global_manager -H "Content-Type: application/json" -d '{"status":"STANDBY","force":false,"federation_id":"","gm_name":""}'
If this does not work, the force option can be tried:
curl -X POST -ik http://localhost:7441/api/v1/sites?action=set_global_manager -H "Content-Type: application/json" -d '{"status":"STANDBY","force":true,"federation_id":"","gm_name":""}'
3) On the site1 (active) GM, use the UI to onboard the site2 GM as STANDBY.
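Step 2 can be wrapped in a small script so that the payload is always sent with the same field names and only the `force` value varies. This is a sketch under stated assumptions: the function names are hypothetical, and "does not work" is interpreted here as a non-2xx HTTP status, which the procedure above does not define. It must be run as root on the site2 GM, like the original commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the request body for the internal
# set_global_manager call. Field names are kept exactly as in step 2;
# only the "force" value ("true" or "false") changes.
standby_payload() {
  local force="$1"
  printf '{"status":"STANDBY","force":%s,"federation_id":"","gm_name":""}' "$force"
}

# Hypothetical helper: send the request and succeed only on a 2xx status.
# Assumption: a non-2xx status means the request "did not work".
set_standby() {
  local force="${1:-false}" code
  code="$(curl -s -o /dev/null -w '%{http_code}' -X POST -k \
    'http://localhost:7441/api/v1/sites?action=set_global_manager' \
    -H 'Content-Type: application/json' \
    -d "$(standby_payload "$force")")"
  echo "force=${force} -> HTTP ${code}"
  case "$code" in
    2??) return 0 ;;
    *)   return 1 ;;
  esac
}
```

A possible invocation is `set_standby false || set_standby true`, which tries the unforced request first and falls back to the force option only if the first call does not return a 2xx status. Reviewing the first response before forcing may be preferable to an automatic fallback.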