NSX "Upgrade in progress" notification when it has already completed

Article ID: 379349

Products

VMware NSX

Issue/Introduction

  • NSX was previously upgraded successfully; however, the UI now indicates that the upgrade is not complete
  • The next planned upgrade is blocked because the host upgrade post checks are now in a failed state
  • In the NSX UI, under System -> Fabric -> Hosts, one or more hosts have status DOWN
  • Status PAUSED is found in the log /var/log/upgrade-coordinator/upgrade-coordinator.log:
    <DATE>T11:51:53.563Z  INFO http-nio-127.0.0.1-7442-exec-1 UpgradeCoordinatorFacadeImpl 3408 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Group:  ########-####-####-####-###########:domain-####, status: PAUSED, unit count: 6, failed count: 0, % complete: 83.0
  • A notification indicating "Upgrade in progress" was generated, even though the current version and target version are the same
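
To check for the PAUSED symptom without reading the log by hand, the log can be scanned with a short script. The following is a minimal sketch, assuming the log lines follow the same layout as the sample entry above; the function name find_paused_groups and the regular expression are illustrative and are not part of NSX.

```python
import re

# Pattern derived from the sample log entry in this article (an assumption:
# other NSX versions may format this line differently).
GROUP_STATUS = re.compile(
    r"Group:\s+(?P<group>\S+), status: (?P<status>\w+), "
    r"unit count: (?P<units>\d+), failed count: (?P<failed>\d+), "
    r"% complete: (?P<percent>[\d.]+)"
)


def find_paused_groups(lines):
    """Return one dict per log line that reports an upgrade group as PAUSED."""
    paused = []
    for line in lines:
        match = GROUP_STATUS.search(line)
        if match and match.group("status") == "PAUSED":
            paused.append({
                "group": match.group("group"),
                "units": int(match.group("units")),
                "failed": int(match.group("failed")),
                "percent": float(match.group("percent")),
            })
    return paused


# Example usage on an NSX Manager (path taken from this article):
# with open("/var/log/upgrade-coordinator/upgrade-coordinator.log") as log:
#     for entry in find_paused_groups(log):
#         print(entry)
```

A PAUSED group with a failed count of 0 and a completion percentage below 100, as in the sample entry, is consistent with the symptom described in this article.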

Environment

VMware NSX

Cause

A previous upgrade completed successfully.
Later, the state of some of the upgraded hosts changed, e.g. hosts were removed from vCenter, or hosts had a hardware failure and are in a not responding state in vCenter.
At some later point the Upgrade Coordinator service was restarted, either through an individual service restart or, for example, an NSX Manager reboot.
Due to a software processing issue, the post-upgrade host checks then show some hosts in a Failed state.
Any subsequent upgrade is blocked until the issue is resolved.

 

Resolution

This issue is resolved in VMware NSX 4.1.2.5 and 4.2.0, available at Broadcom downloads.

To work around this issue, determine which hosts are in a bad state or are missing, and resolve as follows:

  1. In the NSX Manager UI, navigate to System -> Upgrade -> Hosts and identify which hosts have a Failed post check state.
  2. In the NSX Manager UI, navigate to System -> Fabric -> Hosts and check the status of the hosts identified in step 1.
  3. In the vSphere Client, check the status of the hosts identified in step 1.
  4. For hosts in a not responding state in vCenter, resolve the host issue.
  5. If the host issue cannot be resolved, or if the host no longer exists in vCenter, remove the host from the cluster (if it is still present) and delete it from NSX via the API:
    DELETE "/api/v1/transport-nodes/<Host UUID>?force=true&unprepare_host=false"
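
The DELETE call in step 5 can be issued from any REST client. The following is a minimal sketch in Python using only the standard library, assuming HTTP basic authentication against the NSX Manager. The manager address, host UUID and credentials shown are hypothetical placeholders, and build_delete_request is an illustrative helper, not part of any NSX tooling.

```python
import base64
import urllib.parse
import urllib.request


def build_delete_request(manager, host_uuid, username, password):
    """Build (but do not send) the transport node DELETE request from step 5."""
    # Query parameters exactly as given in this article.
    query = urllib.parse.urlencode({"force": "true", "unprepare_host": "false"})
    path = "/api/v1/transport-nodes/" + urllib.parse.quote(host_uuid, safe="")
    url = "https://" + manager + path + "?" + query

    request = urllib.request.Request(url, method="DELETE")
    # Assumption: basic authentication is enabled for the account in use.
    token = base64.b64encode((username + ":" + password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Hypothetical values; replace with the real manager address and host UUID.
    req = build_delete_request(
        "nsx-manager.example.com",
        "11111111-2222-3333-4444-555555555555",
        "admin",
        "changeme",
    )
    print(req.get_method(), req.full_url)
    # Sending is left commented out so the sketch cannot delete anything by accident:
    # urllib.request.urlopen(req)
```

Building the request separately from sending it allows the URL to be reviewed before the transport node is removed. Sending it against a manager that uses a self-signed certificate would additionally require an appropriate SSL context.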

Additional Information

VMware NSX