After NSX Manager Cluster upgrade, NSX prechecks fail or subsequent NSX component upgrades remain stuck in PAUSED state.



Article ID: 433494


Updated On:

Products

VMware NSX

Issue/Introduction

After the NSX Manager component has completed its upgrade to the target version, users may notice that:

  1. Subsequent NSX precheck operations fail, or

  2. Remaining NSX component upgrades (Host, Edge, or Finalize) cannot be started and remain in PAUSED state.

To narrow down and identify the specific issue, perform the following actions: 

  1. Trigger the following GET API call on all nodes in the NSX Manager Cluster:

    https://{{nsx-mp-ip}}/api/v1/upgrade/status-summary 

    Note: On a cluster with 3 Manager nodes, this call needs to be invoked on the address of each of the 3 nodes.

  2. Verify that "overall_upgrade_status" is "IN_PROGRESS" on exactly one of the Management Plane nodes.

  3. Verify that "overall_upgrade_status" is "PAUSED" on all other Management Plane nodes.
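
The checks above can be scripted. The following is a minimal Python sketch, not an official tool: the endpoint path and the "overall_upgrade_status" field come from this article, while the function names, the use of HTTP basic authentication, and the `verify_tls` switch are assumptions made for illustration.

```python
import base64
import json
import ssl
import urllib.request

STATUS_PATH = "/api/v1/upgrade/status-summary"  # endpoint named in this article


def fetch_overall_status(node_ip, username, password, verify_tls=True):
    """Return the "overall_upgrade_status" value reported by one Manager node."""
    # Assumption: the node accepts HTTP basic authentication for this call.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{node_ip}{STATUS_PATH}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: only needed where the nodes use self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return json.load(response)["overall_upgrade_status"]


def matches_symptom(statuses):
    """True if exactly one node is IN_PROGRESS and every other node is PAUSED."""
    values = list(statuses.values())
    return (
        len(values) > 1
        and values.count("IN_PROGRESS") == 1
        and values.count("PAUSED") == len(values) - 1
    )


def node_to_restart(statuses):
    """Return the node reporting IN_PROGRESS, or None if the symptom does not match."""
    if not matches_symptom(statuses):
        return None
    return next(node for node, status in statuses.items() if status == "IN_PROGRESS")
```

To use it, call `fetch_overall_status` once per Manager node, collect the results in a dictionary keyed by node address, and pass that dictionary to `node_to_restart`. A return value of `None` means the statuses do not match the pattern described in this article.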

Environment

VMware NSX 9.1

Cause

This issue can occur when the upgrade-coordinator (UC) service startup takes longer than expected (about 2 - 3 minutes) during the NSX Manager component upgrade phase. It can manifest either as a precheck failure or as an NSX component upgrade that remains stuck in PAUSED state.

Resolution

  1. SSH (as root) to the NSX Manager node showing "overall_upgrade_status" as "IN_PROGRESS".

  2. Restart the upgrade-coordinator service on that node with one of the following commands:

    service upgrade-coordinator restart  (from admin CLI)

    or

    systemctl restart upgrade-coordinator (from root CLI)

  3. Wait 10 minutes. 

  4. SSH to each of the 3 NSX Manager nodes and trigger the following GET API call on each node:
    GET https://{{nsx-mp-ip}}/api/v1/upgrade/status-summary

    Ensure that all nodes report the same "overall_upgrade_status" (PAUSED).

  5. It is now safe to proceed with subsequent upgrade operations.
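
The verification in step 4 can be sketched in Python as follows. This is an illustrative helper, not part of the documented procedure: the function names are hypothetical, `get_statuses` stands for any caller-supplied function that returns a dictionary of node address to "overall_upgrade_status", and polling within the 10-minute window (instead of a single fixed wait) is a choice made by this sketch.

```python
import time


def is_consistent_paused(statuses):
    """True if at least one node is listed and every node reports PAUSED."""
    values = list(statuses.values())
    return bool(values) and all(value == "PAUSED" for value in values)


def wait_for_consistent_state(get_statuses, timeout_s=600, interval_s=30,
                              sleep=time.sleep):
    """Poll get_statuses() until all nodes report PAUSED or the timeout elapses.

    get_statuses is a hypothetical callable returning {node: status}.
    The 600-second default mirrors the 10-minute wait in step 3.
    """
    waited = 0
    while True:
        statuses = get_statuses()
        if is_consistent_paused(statuses):
            return statuses
        if waited >= timeout_s:
            raise TimeoutError(f"nodes still disagree after {waited}s: {statuses}")
        sleep(interval_s)
        waited += interval_s
```

If the function raises `TimeoutError`, the nodes have not converged on PAUSED and it is not yet safe to continue with subsequent upgrade operations.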