NSX-T Manager Upgrade Stalls at 30%

Article ID: 306230


Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:
When upgrading the NSX-T Manager (the final step in upgrading NSX-T), the upgrade stalls at 30%. At this point in the manager upgrade, the GUI is inaccessible.


Environment

VMware NSX-T Data Center 2.x
VMware NSX-T Data Center

Cause

The NSX-T Manager upgrade consists of ten internal steps. For each step, a script runs in the background to carry out the task. The ten steps are as follows:
  - name: pre_upgrade_validation
  - name: shutdown_manager
  - name: install_os
  - name: migrate_manager_config
  - name: switch_os
  - name: reboot
  - name: run_migration_tool
  - name: start_manager
  - name: update_upgrade_status
  - name: finish_upgrade



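As the following log excerpt shows, the upgrade does not get past the second step, shutdown_manager. An earlier shutdown_manager task is still in the TASK_IN_PROGRESS state, so the step is not run again and the playbook fails.
From /var/log/syslog: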

<182>1 2018-09-19T17:40:54.461124+00:00 spcn#####d01 NSX 5371 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="INFO"] Running "pre_upgrade_validation" (step 1 of 10)
<182>1 2018-09-19T17:40:54.478845+00:00 spcn######d01 NSX 5371 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="INFO"] Running "shutdown_manager" (step 2 of 10)
<182>1 2018-09-19T17:40:54.480041+00:00 spc######d01 NSX 5371 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="INFO"] Found existing task for shutdown_manager, with id {u'from_version': u'2.1.0.0.0.7395503', u'bundle_files_path': u'/image/VMware-NSX-unified-appliance-2.2.0.0.0.8680778/files', u'new_config_path': u'/config_bak', u'old_config_path': u'/config', u'to_version': u'2.2.0.0.0.8680778', u'old_os_path': u'/', u'node_type': u'nsx-manager', u'status_file': u'/tmp/upgradeQxOD5l', u'new_os_path': u'/os_bak'}, with args c379f053-1725-4a25-####-########### state TASK_IN_PROGRESS. Not running again.
<179>1 2018-09-19T17:40:54.480627+00:00 spcnsxtmgrd01 NSX 5371 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="ERROR"] Playbook failed at step shutdown_manager. Run the command 'set debug-mode' followed by 'start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step get_upgrade_task_history' for more info.
<179>1 2018-09-19T17:40:54.480762+00:00 spc#######01 NSX 5371 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="ERROR"] Playbook failed at step shutdown_manager. Run the command 'set debug-mode' followed by 'start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step get_upgrade_task_history' for more info.
<182>1 2018-09-19T17:41:54.510296+00:00 sp######d01 NSX 5836 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="INFO"] Running "pre_upgrade_validation" (step 1 of 10)
<182>1 2018-09-19T17:41:54.528282+00:00 sp######d01 NSX 5836 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="INFO"] Running "shutdown_manager" (step 2 of 10)
<182>1 2018-09-19T17:41:54.530179+00:00 sp########01 NSX 5836 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="INFO"] Found existing task for shutdown_manager, with id {u'from_version': u'2.1.0.0.0.7395503', u'bundle_files_path': u'/image/VMware-NSX-unified-appliance-2.2.0.0.0.8680778/files', u'new_config_path': u'/config_bak', u'old_config_path': u'/config', u'to_version': u'2.2.0.0.0.8680778', u'old_os_path': u'/', u'node_type': u'nsx-manager', u'status_file': u'/tmp/upgradeQxOD5l', u'new_os_path': u'/os_bak'}, with args c379f053-1725-4a25-####-############ state TASK_IN_PROGRESS. Not running again.
<179>1 2018-09-19T17:41:54.531257+00:00 spc#######01 NSX 5836 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="ERROR"] Playbook failed at step shutdown_manager. Run the command 'set debug-mode' followed by 'start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step get_upgrade_task_history' for more info.
<179>1 2018-09-19T17:41:54.531626+00:00 spc########01 NSX 5836 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="upgrade-bundle" level="ERROR"] Playbook failed at step shutdown_manager. Run the command 'set debug-mode' followed by 'start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step get_upgrade_task_history' for more info.
<182>1 2018-09-19T17:42:07.506207+00:00 spcns######01 NSX 4204 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="node-cli" username="admin" level="INFO"] CMD: start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step get_upgrade_task_history
<182>1 2018-09-19T17:42:08.912814+00:00 spc######d01 NSX 4204 SYSTEM [nsx@6876 comp="nsx-manager" subcomp="node-cli" username="admin" level="INFO" audit="true"] CMD: start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step get_upgrade_task_history (duration: 1.406s)
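To confirm which step an upgrade is stuck on, the syslog entries above can be scanned for the "Running" and "Playbook failed" messages. The following Python snippet is a minimal sketch, not a VMware tool: the function name summarize_upgrade_log and both regular expressions are assumptions derived only from the message format shown in the excerpt above.

```python
import re

# Patterns inferred from the log excerpt in this article (assumption:
# other NSX-T versions may word these messages differently).
RUNNING_RE = re.compile(
    r'Running "(?P<step>\w+)" \(step (?P<num>\d+) of (?P<total>\d+)\)'
)
FAILED_RE = re.compile(r"Playbook failed at step (?P<step>\w+)\.")


def summarize_upgrade_log(lines):
    """Return the last step that started and the steps reported as failed.

    `lines` is any iterable of syslog lines. Only entries logged by the
    upgrade-bundle subcomponent are considered.
    """
    last_running = None
    failed_steps = []
    for line in lines:
        if 'subcomp="upgrade-bundle"' not in line:
            continue
        match = RUNNING_RE.search(line)
        if match:
            last_running = (
                match["step"],
                int(match["num"]),
                int(match["total"]),
            )
            continue
        match = FAILED_RE.search(line)
        # The same failure is often logged more than once; record it once.
        if match and match["step"] not in failed_steps:
            failed_steps.append(match["step"])
    return {"last_running": last_running, "failed_steps": failed_steps}
```

For example, passing the open file object for a copy of /var/log/syslog to summarize_upgrade_log returns a dictionary whose last_running entry names the last step that started and whose failed_steps entry lists each step reported as failed.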




Resolution

This issue is resolved in NSX-T 2.4 (Flash).

Workaround:
If you encounter this issue, follow the steps below to work around it:

1) Reboot the NSX-T Manager so that the hung systemctl step is halted.

2) After the manager comes back up, make sure the cloud-init service is in the active (exited) state, not the failed state. Running "service cloud-init status" from the root shell should show active (exited). If the service is in the failed state, run "service cloud-init restart" from the root shell and verify that the state is active (exited) before proceeding.

3) From the admin shell, run the following CLI commands:
  - set debug-mode
  - start upgrade-bundle <bundle-name> step finish_upgrade
  (Example: "start upgrade-bundle VMware-NSX-unified-appliance-2.2.0.0.0.8680778 step finish_upgrade")

4) Go to the upgrade page in the Management Plane (MP) UI.

5) Click the Start/Retry button on the MP upgrade page to retry the MP upgrade.
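
The cloud-init check in step 2 can also be scripted when several managers need to be verified. The following Python snippet is a minimal sketch under stated assumptions: it assumes that "service cloud-init status" prints a systemd-style "Active:" line on the appliance, and the function names parse_service_state, cloud_init_is_healthy, and read_cloud_init_status are hypothetical helpers, not part of NSX-T.

```python
import re
import subprocess

# Matches a systemd-style status line such as:
#    Active: active (exited) since Wed 2018-09-19 17:50:00 UTC; 1min ago
# Assumption: the appliance reports cloud-init status in this format.
ACTIVE_RE = re.compile(
    r"^\s*Active:\s*(?P<state>\w+)(?:\s*\((?P<sub>[^)]*)\))?",
    re.MULTILINE,
)


def parse_service_state(status_output):
    """Return (state, sub_state) from status text, or None if not found."""
    match = ACTIVE_RE.search(status_output)
    if not match:
        return None
    return (match["state"], match["sub"])


def cloud_init_is_healthy(status_output):
    """True only when the service reports active (exited)."""
    return parse_service_state(status_output) == ("active", "exited")


def read_cloud_init_status():
    """Run the status command from step 2 and return its standard output.

    check=False is used because the status command exits with a non-zero
    code when the service is failed or inactive.
    """
    proc = subprocess.run(
        ["service", "cloud-init", "status"],
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.stdout
```

On the manager's root shell, cloud_init_is_healthy(read_cloud_init_status()) returning False indicates that the service should be restarted as described in step 2 before continuing with step 3.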