NSX Host Transport Node Cluster upgrade is stuck "In Progress... 0%"

Article ID: 436589

Products

VMware NSX

Issue/Introduction

  • During an NSX upgrade of ESXi host transport nodes, the upgrade of one ESXi host cluster is stuck at "In Progress... 0%".
  • When the upgrade is paused, the Pause task is also stuck with the following message:
    "Pause has been initiated. System will pause after completing execution of upgrade units that are in progress."
  • The following log entries are present in /var/log/upgrade-coordinator/upgrade-coordinator.log at the time of the host cluster upgrade. In this example, vlcm-e2######-####-####-####-##########a7 is the stale host UUID in the host upgrade unit.
    INFO http-nio-127.0.0.1-7442-exec-7 UpgradeCoordinatorFacadeImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Got upgrade status summary overallUpgradeStatus: IN_PROGRESS
    INFO http-nio-127.0.0.1-7442-exec-7 UpgradeCoordinatorFacadeImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Component: EDGE, status: SUCCESS, % complete: 100.0, details: , canSkip: true
    INFO http-nio-127.0.0.1-7442-exec-7 UpgradeCoordinatorFacadeImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Component: HOST, status: IN_PROGRESS, % complete: 22.0, details: [<ESXi_Host_IP>] Finishing upgrade, canSkip: false
    INFO http-nio-127.0.0.1-7442-exec-7 UpgradeCoordinatorFacadeImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Component: MP, status: NOT_STARTED, % complete: 0.0, details: null, canSkip: false
    ERROR task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 WorkItem 3274255 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30033" level="ERROR" subcomp="upgrade-coordinator"] Error reported by plugin during upgrade of upgrade unit vlcm-e2######-####-####-####-##########a7
    com.vmware.nsx.management.upgrade.exceptions.UpgradeUnitUpgradeException: null
            at com.vmware.nsx.management.upgrade.executionengine.SingleWorkItem.executePreUpgradeChecks(SingleWorkItem.java:170) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.executionengine.SingleWorkItem.executeWorkItem(SingleWorkItem.java:114) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.executionengine.SingleWorkItem.run(SingleWorkItem.java:90) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.common.executor.TaskExecutorImpl$TaskWrapper$1.run(TaskExecutorImpl.java:240) ~[libmp_common.jar:?]
            at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
            at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
            at com.vmware.nsx.management.common.executor.TaskExecutorImpl$TaskWrapper.run(TaskExecutorImpl.java:273) ~[libmp_common.jar:?]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
            at com.vmware.nsx.util.concurrent.Executors$MeteredRunnable.run(Executors.java:353) ~[libnsx_util.jar:?]
            at java.lang.Thread.run(Unknown Source) ~[?:?]

    INFO task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 ExecutionMonitorServiceImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Updating error list of upgrade unit vlcm-e2######-####-####-####-##########a7
    INFO task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 ExecutionMonitorServiceImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Execution monitor service invoked to react to failure of node vlcm-e2######-####-####-####-##########a7 [Pre-upgrade checks failed for null: Please check logs for more details.]
    INFO task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 UpgradeQueryServiceImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Retrieving upgrade unit with id vlcm-e2######-####-####-####-##########a7
    INFO task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 UpgradeQueryServiceImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Found group level upgrade unit with id: vlcm-e2######-####-####-####-##########a7
    ERROR task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 UpgradeQueryServiceImpl 3274255 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30003" level="ERROR" subcomp="upgrade-coordinator"] Invalid component type: null
    ERROR task-executor-8-1-workitem-null-vlcm-e2######-####-####-####-##########a7 WorkItem 3274255 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30062" level="ERROR" subcomp="upgrade-coordinator"] Error encountered while upgrading upgrade unit null
    com.vmware.nsx.management.upgrade.exceptions.InvalidComponentTypeException: null
            at com.vmware.nsx.management.upgrade.service.impl.UpgradeQueryServiceImpl.getComponentUpgradePlan(UpgradeQueryServiceImpl.java:935) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.service.impl.UpgradeQueryServiceImpl.getComponentPlanSettingsForUpgradeUnit(UpgradeQueryServiceImpl.java:898) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.executionengine.ExecutionMonitorServiceImpl.onNodeFailure(ExecutionMonitorServiceImpl.java:71) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.monitor.impl.UpgradeUnitProgressCollectorImpl.reportFailure(UpgradeUnitProgressCollectorImpl.java:39) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.executionengine.SingleWorkItem.executeWorkItem(SingleWorkItem.java:128) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.upgrade.executionengine.SingleWorkItem.run(SingleWorkItem.java:90) ~[libuc-core.jar:?]
            at com.vmware.nsx.management.common.executor.TaskExecutorImpl$TaskWrapper$1.run(TaskExecutorImpl.java:240) ~[libmp_common.jar:?]
            at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
            at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
            at com.vmware.nsx.management.common.executor.TaskExecutorImpl$TaskWrapper.run(TaskExecutorImpl.java:273) ~[libmp_common.jar:?]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
            at com.vmware.nsx.util.concurrent.Executors$MeteredRunnable.run(Executors.java:353) ~[libnsx_util.jar:?]
            at java.lang.Thread.run(Unknown Source) ~[?:?]

    Note: The preceding log excerpts are only examples. The date, time, and environment-specific values (such as the host UUID) will vary depending on your environment.
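
To check whether a given upgrade-coordinator.log shows the same pattern, the log can be scanned for the error codes that appear in the excerpt above. The following is a minimal sketch, not a supported tool. It assumes that the codes MP30033, MP30003, and MP30062 together indicate this issue and that the upgrade unit ID always appears in the thread name in the form shown in the excerpt; both assumptions are drawn from this example only. The function name find_suspect_upgrade_units is hypothetical.

```python
import re
from collections import defaultdict

# Error codes seen in the excerpt above. Treating them as the signature of
# this issue is an assumption made for illustration.
SIGNATURE_CODES = {"MP30033", "MP30003", "MP30062"}

ERROR_CODE_RE = re.compile(r'errorCode="(MP\d+)"')
# Thread names in the excerpt look like
# "task-executor-8-1-workitem-null-<upgrade-unit-id>".
UNIT_ID_RE = re.compile(r"task-executor-\S*?-workitem-\S*?-(vlcm-\S+)")


def find_suspect_upgrade_units(lines):
    """Map each upgrade unit ID to the signature error codes logged for it."""
    hits = defaultdict(set)
    for line in lines:
        code = ERROR_CODE_RE.search(line)
        unit = UNIT_ID_RE.search(line)
        # Only count lines that carry both a known code and a unit ID.
        if code and unit and code.group(1) in SIGNATURE_CODES:
            hits[unit.group(1)].add(code.group(1))
    return dict(hits)
```

The function accepts any iterable of log lines, so an open file handle for a copy of upgrade-coordinator.log can be passed in directly. An upgrade unit ID that is reported with these codes is only a candidate for the stale entry and should be confirmed against the full log context.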

Environment

VMware NSX
VMware NSX-T Data Center

Cause

This issue is caused by a stale host in the host upgrade unit: the stale host no longer exists in the DB but is still present in the upgrade unit plan.

Resolution

This is a known issue impacting VMware NSX.

Workaround:

Additional Information

A similar null pointer exception may also occur during the upgrade of NSX Edge. For more information, see the KB article NSX upgrade is pausing and displaying error "Invalid component type: null".