Troubleshooting NSX Upgrade Failures

Article ID: 378902

Updated On:

Products

VMware NSX

Issue/Introduction

  • This article provides information on troubleshooting NSX upgrade failures.
  • An NSX upgrade involves the following major components:
    • Upgrade Coordinator
    • Prechecks
    • Edge
    • Host
    • Manager
  • Prior to NSX 4.0.1.1 there was a defined upgrade order of Edge, followed by Host, followed by Manager.
  • From 4.0.1.1, some flexibility was added that allows the order of Edge and Host component upgrade to be interchangeable.
  • In all cases, all Edges and all Hosts must be upgraded before the Manager upgrade will start.
  • The NSX main upgrade bundle, .mub, contains the software needed to upgrade all components. It also contains the prechecks.
  • The NSX pre-upgrade checks bundle, .pub, is an asynchronous prechecks bundle decoupled from the main upgrade bundle. This allows Broadcom to dynamically update upgrade prechecks if we become aware of an issue that may impact upgrades.
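
The ordering rules above can be expressed as a small check. This is only an illustrative sketch, not part of any NSX tooling: the function names and the lowercase component labels are hypothetical, and it assumes version strings are purely numeric and dot-separated.

```python
from typing import List, Tuple

# First release in which Edge and Host may be upgraded in either order
# (taken from the article text above).
FLEXIBLE_FROM = (4, 0, 1, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.0.1.1' into (4, 0, 1, 1). Assumes numeric, dotted versions."""
    return tuple(int(part) for part in version.split("."))


def is_valid_order(source_version: str, order: List[str]) -> bool:
    """Return True if 'order' is an allowed component upgrade sequence.

    'order' must contain exactly 'edge', 'host' and 'manager'
    (hypothetical labels chosen for this sketch).
    """
    if sorted(order) != ["edge", "host", "manager"]:
        return False
    # In all cases the Manager upgrade comes last.
    if order[-1] != "manager":
        return False
    # Before 4.0.1.1 the order is fixed: Edge, then Host, then Manager.
    if parse_version(source_version) < FLEXIBLE_FROM:
        return order == ["edge", "host", "manager"]
    # From 4.0.1.1 onward Edge and Host are interchangeable.
    return True
```

For example, is_valid_order("4.0.1.1", ["host", "edge", "manager"]) returns True, while the same order with "3.2.3" returns False.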

Resolution

Prerequisites

  • Confirm the target NSX version is compatible with other products in the environment. Also confirm the NSX upgrade path is supported. Reference Interop Matrix.
  • Ensure the ports required for the upgrade are open, e.g. 443 and 8080. See Required Ports.
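
As a quick sanity check of the port requirement, a plain TCP connect test can be run from a machine that needs to reach the target. This is a minimal sketch: the function name port_open and the host name in the usage comment are placeholders, and a successful connect only shows that the port is reachable from where the script runs, not that every required path in Required Ports is open.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example usage (replace the placeholder host with a real NSX Manager address):
#
#   for port in (443, 8080):
#       state = "open" if port_open("nsx-mgr.example.com", port) else "unreachable"
#       print(port, state)
```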

Always start the upgrade from the orchestrator node.
To identify the orchestrator node, run the command "get service install-upgrade" from the admin CLI on any one of the three NSX Managers and note the manager listed under "Enabled on".

Upgrade Guides:

Log locations:

NSX Manager

    • /var/log/upgrade-coordinator/upgrade-coordinator.log (This log is mostly useful before the shutdown_manager step; after that step, the Upgrade Coordinator services are stopped.)
    • /var/log/resume-upgrade.log (This log is only useful after the shutdown_manager step on the orchestrator node.)
    • /var/log/policy/data-migration.log (If the upgrade failed in the run_migration_tool step, this log helps identify the cause of the failure.)
    • /var/log/proton/data-migration.log
    • /var/log/nsx-cli/nsxcli.log (Used to verify the playbook task start and completion status on the orchestrator node.)
    • /var/log/repository/access.log
    • /var/log/syslog
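
When working through the logs listed above, a short script can pull out candidate lines for closer reading. This is a rough sketch only: the helper name find_lines is hypothetical, and the search string "ERROR" is an assumption rather than a documented log format, so adjust it to whatever the failing step actually reports.

```python
from pathlib import Path
from typing import List, Tuple

# Log paths copied from the list above.
UPGRADE_LOGS = [
    "/var/log/upgrade-coordinator/upgrade-coordinator.log",
    "/var/log/resume-upgrade.log",
    "/var/log/policy/data-migration.log",
    "/var/log/proton/data-migration.log",
    "/var/log/nsx-cli/nsxcli.log",
    "/var/log/repository/access.log",
    "/var/log/syslog",
]


def find_lines(path: str, needle: str = "ERROR",
               max_hits: int = 20) -> List[Tuple[int, str]]:
    """Return up to max_hits (line_number, text) pairs containing needle.

    A missing file yields an empty list instead of raising, since not
    every log exists on every node or at every stage of the upgrade.
    """
    hits: List[Tuple[int, str]] = []
    log = Path(path)
    if not log.is_file():
        return hits
    with log.open("r", errors="replace") as handle:
        for lineno, line in enumerate(handle, start=1):
            if needle in line:
                hits.append((lineno, line.rstrip("\n")))
                if len(hits) >= max_hits:
                    break
    return hits


# Example usage on an NSX Manager (reading these files may need root access):
#
#   for log_path in UPGRADE_LOGS:
#       for lineno, text in find_lines(log_path):
#           print(f"{log_path}:{lineno}: {text}")
```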


Known issues

Component Troubleshooting:

Handling Log Bundles for offline review with Broadcom support: