NSX Global Manager Upgrade Fails with REPO_SYNC FAILED and missing gm_compatibility.versions

Article ID: 432847

Products

VMware NSX

Issue/Introduction

During an upgrade of VMware NSX Global Managers to version 9.0.2.0.0.25150386, either the upgrade prechecks or the upgrade itself fails. Symptoms:

  • Upgrade status for one or more Global Managers displays: REPO_SYNC = FAILED.

  • Error logs/UI message: Unable to connect to File /repository/9.0.2.0.0.25150386/UC/gm_compatibility.versions. Please verify that file exists on source and install-upgrade service is up.

Environment

 

  • Product: VMware NSX (Global Managers)

  • Versions: 4.x, 9.x

 

Cause

The issue occurs because the peer Global Manager nodes cannot synchronize the upgrade bundle from the node acting as the Upgrade Coordinator (UC) orchestrator. This is typically due to:

  1. The install-upgrade service being down or hung on one or more nodes.

  2. Incorrect filesystem permissions on the /repository directory preventing the uuc user from accessing metadata files like gm_compatibility.versions.
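To tell the two causes apart, it can help to inspect the metadata file directly on the orchestrator node. The following is a minimal sketch, not a VMware-supplied tool: check_repo_file is a hypothetical helper name, and the sketch assumes GNU stat is available on the appliance.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether a repository file exists and,
# if it does, print its owner, group and octal mode.
check_repo_file() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "MISSING $path"
        return 1
    fi
    # GNU stat format: %U = owner name, %G = group name, %a = octal mode
    echo "FOUND $(stat -c '%U:%G %a' "$path") $path"
}
```

For example, check_repo_file /repository/9.0.2.0.0.25150386/UC/gm_compatibility.versions prints either MISSING or the owner, group, and mode of the file, which can then be compared with the ownership set in the Resolution section.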

Resolution

Method 1: Manual CLI and UI Remediation (Recommended)

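  1. SSH to each Global Manager node. The service commands in step 2 are NSX CLI commands and run from the admin shell; the permission commands in step 3 require root.

  2. Verify Service Status: Run get service install-upgrade. If the service is not running, start it with start service install-upgrade.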

  3. Fix Permissions: Correct the ownership and access rights of the repository directory so the upgrade coordinator user can read the files:

    chown -R uuc:grepodir /repository
    chmod -R 770 /repository
    
  4. UI Reconciliation:

    • Navigate to System > Appliances.

    • Locate the node with the REPO_SYNC failure.

    • Click the RESOLVE button.

    • Wait for the status to return to SUCCESS.
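The permission fix in step 3 can be wrapped in a small function so that it can be tried on a scratch directory first and the result printed for review. This is a sketch under assumptions: fix_repo_perms is a hypothetical name, its defaults simply mirror the values given in step 3, and GNU stat is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the chown/chmod commands from step 3.
# Arguments (all optional): directory, owner:group, octal mode.
fix_repo_perms() {
    local dir="${1:-/repository}"
    local owner="${2:-uuc:grepodir}"
    local mode="${3:-770}"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Apply ownership and mode recursively, stopping on the first failure.
    chown -R "$owner" "$dir" || return 1
    chmod -R "$mode" "$dir" || return 1
    # Print owner, group, mode and path of the top-level directory for review.
    stat -c '%U:%G %a %n' "$dir"
}
```

Called without arguments as root, fix_repo_perms applies the same ownership and mode as step 3 to /repository. Passing a temporary directory and the current user's own uid:gid allows a harmless trial run.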


Method 2: API Remediation (If UI Resolve Fails)

If the UI "Resolve" button does not clear the error, use the following API calls to reset the upgrade state and force a re-sync.

  1. Identify Stale Fabric Modules: Check for stale repository entries across the cluster: GET https://<NSX-GM-IP>/api/v1/fabric/modules

  2. Reset the Upgrade Plan: Force a reset of the Management Plane upgrade state to clear hung sync tasks: POST https://<NSX-GM-IP>/api/v1/upgrade/plan?action=reset&component_type=MP

  3. Trigger Repository Sync Manually: Use the following call for each node ID that is failing: POST https://<NSX-GM-IP>/api/v1/upgrade/nodes/<NODE-UUID>?action=reconcile_repository
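The three calls above could be issued with curl along the following lines. This is only a sketch: the endpoint paths are taken from the steps above, while nsx_api, the wrapper function names, the NSX_GM and NSX_USER variables, the example host name, and the DRY_RUN switch are all assumptions introduced for illustration. The sketch defaults to printing the command rather than running it.

```bash
#!/usr/bin/env bash
# Placeholder values for this sketch; replace with the real manager address and user.
NSX_GM="${NSX_GM:-nsx-gm.example.com}"
NSX_USER="${NSX_USER:-admin}"
# DRY_RUN=1 (the default here) prints the curl command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

nsx_api() {
    local method="$1"
    local path="$2"
    local url="https://${NSX_GM}${path}"
    if [ "$DRY_RUN" = "1" ]; then
        echo "curl -k -u ${NSX_USER} -X ${method} '${url}'"
        return 0
    fi
    # -k skips certificate verification; omit it if the manager certificate is trusted.
    # -u with only a user name makes curl prompt for the password.
    curl -k -u "$NSX_USER" -X "$method" "$url"
}

# Step 1: list fabric modules.
list_fabric_modules() { nsx_api GET "/api/v1/fabric/modules"; }
# Step 2: reset the Management Plane upgrade plan.
reset_mp_upgrade_plan() { nsx_api POST "/api/v1/upgrade/plan?action=reset&component_type=MP"; }
# Step 3: reconcile the repository for one node; $1 is the node UUID.
reconcile_repository() { nsx_api POST "/api/v1/upgrade/nodes/$1?action=reconcile_repository"; }
```

With DRY_RUN=1 each function only prints the command it would run, so the URLs can be reviewed before setting DRY_RUN=0 and executing them against the Global Manager.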


Additional Information

Ensure that the / (root) partition has at least 10 GB of free space on every Global Manager. A full disk prevents the install-upgrade service from writing temporary manifest files.
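The free-space requirement can be checked on each node with a short script. The following is a sketch: free_gb and check_free_space are hypothetical helper names, and the 10 GB default comes from the guidance above.

```bash
#!/usr/bin/env bash
# Print the free space, in whole GB, of the filesystem holding the given path.
free_gb() {
    # df -P gives one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 is the available space; 1048576 KB = 1 GB.
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1048576) }'
}

# Compare free space on a path (default /) with a minimum in GB (default 10).
check_free_space() {
    local path="${1:-/}"
    local min_gb="${2:-10}"
    local avail
    avail="$(free_gb "$path")"
    if [ "$avail" -ge "$min_gb" ]; then
        echo "OK: ${avail} GB free on ${path}"
    else
        echo "LOW: ${avail} GB free on ${path}, need ${min_gb} GB"
        return 1
    fi
}
```

Running check_free_space with no arguments checks / against the 10 GB threshold and returns a non-zero exit status when the partition is below it.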