NSX-T upgrade failed from SDDC Manager for multiple transport nodes in the VLCM cluster.


Article ID: 420493


Updated On:

Products

VMware NSX

Issue/Introduction

  • In the SDDC Manager UI, the NSX host upgrade fails with an error indicating that the ESXi hosts are not reachable from vCenter, as shown in the screenshot below:

    "Error: Check for error in the LCM log files at 127.0.0.1/var/log/vmware/vcf/lcm and address those errors. Check if the SDDC manager is able to communicate with NSX manager. if not login to NSX and check if upgrade is running and wait for the completion"



  • The NSX ESXi host upgrade may fail and remain paused at XX% on the NSX upgrade page, even after clicking "Continue" from the NSX UI, as shown in the screenshot below.



  • The upgrade or host remediation process begins with a host compliance check initiated from vCenter.
  • In this case, the host compliance check times out for the hosts in the cluster with a "host not reachable" error.

  • Although the errors suggest that the hosts are not reachable from vCenter, there may be no actual network issue: connectivity between vCenter and the affected ESXi hosts can be healthy on all required ports.
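
To rule out a genuine network problem before treating the error as a false positive, a quick TCP reachability test can help. The following is a minimal sketch, not a VMware tool. It assumes it is run from a machine with the same network path as vCenter, and it assumes TCP ports 443 and 902 as the ports of interest; confirm the required ports for your versions against the official ports reference. The function names are illustrative, and a TCP check does not cover UDP traffic.

```python
import socket

# Assumed ports for illustration only; verify against the official ports list.
DEFAULT_PORTS = (443, 902)


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_hosts(hosts, ports=DEFAULT_PORTS, timeout: float = 3.0) -> dict:
    """Map each host to a dict of port -> reachable flag."""
    return {
        host: {port: check_tcp_port(host, port, timeout) for port in ports}
        for host in hosts
    }
```

Calling `check_hosts(["esxi01.example.local"])` (a hypothetical host name) returns a per-port result for each host. If every port reports as reachable while the compliance check still times out, the behavior matches the issue described in this article.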

Environment

VMware NSX

Cause

  • The upgrade/remediation process begins with a compliance check. In this instance, the vLCM scan unit in vCenter marked the hosts as unknown and skipped them during the upgrade because it could not reach the ESXi hosts.
  • The vLCM apply-solution task incorrectly returned a SUCCEEDED status, even though hosts were skipped and an error was present in the host_status object. Because the task was marked successful, the NSX-T upgrade unit (which orchestrates the vLCM upgrade) was also marked successful, and the progress was updated in the database.
  • Consequently, if a user retries the upgrade from the NSX-T UI, the upgrade coordinator skips the process because it considers the upgrade already complete, based on the previous apply-solution task status.
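
The logic gap described above can be sketched as follows. This is an illustration only: the dictionary layout, the `SKIPPED` value, and the function names are hypothetical and do not reflect the actual vLCM or NSX-T data model. The sketch shows how a task-level SUCCEEDED status can be cross-checked against per-host results so that skipped or errored hosts are not silently counted as upgraded.

```python
def hosts_needing_attention(task: dict) -> list:
    """Return hosts that were skipped or carry an error in host_status."""
    problems = []
    for host, info in task.get("host_status", {}).items():
        # Hypothetical per-host fields: "status" and "error".
        if info.get("status") == "SKIPPED" or info.get("error"):
            problems.append(host)
    return sorted(problems)


def effective_status(task: dict) -> str:
    """Downgrade a task-level SUCCEEDED when any host was skipped or errored."""
    status = task.get("status", "UNKNOWN")
    if status != "SUCCEEDED":
        return status
    return "FAILED" if hosts_needing_attention(task) else "SUCCEEDED"
```

With a task that reports SUCCEEDED but has one host marked as skipped with a "host not reachable" error, `effective_status` returns `FAILED` and `hosts_needing_attention` lists that host, which is the outcome a retry would need in order to run again.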

Resolution

This is a known issue. Use the following workaround:
a) Log in to the vCenter UI.
b) Remediate the vLCM clusters and wait for the remediation to complete.
c) Open the NSX-T Manager UI and refresh the upgrade page. The host upgrade should now show as completed successfully.

A fix will be provided in a future release.

Note: If the above workaround does not help, open a Broadcom Support case. For more information, refer to Creating and managing Broadcom support cases.

Additional Information

Refer to this similar issue, where restarting the settingsd service also resolved the problem: NSX Host Upgrade May Fail in SDDC with Error: Host group upgrade status is FAILED for group #### [Upgrade failed: Host #### is not reachable. Check host's connectivity and health in vCenter and retry]