NSX Upgrade 4.1.2.x - Hosts go into disconnected state, unresponsive until reboot

Article ID: 368630

Updated On:

Products

VMware SDDC Manager, VMware NSX-T Data Center, VMware vSphere ESXi 8.0

Issue/Introduction

  • After the NSX upgrade is initiated from SDDC Manager or the NSX UI, the hosts lose network connectivity.
  • An upgrade from ESXi 7.0 to ESXi 8.0 shows similar symptoms during the NSX installation.
  • No PSOD is generated.
  • The host is in a disconnected state.
  • Restarting the management agents does not restore host networking.
  • The upgrade fails, and the host must be rebooted to recover.
  • During the VIB upgrade, esxupdate.log shows errors such as:
    <timestamp> Db(15) esxupdate[2111175] installer LiveImageInstaller failed: VMware_bootbank_nsx-esx-datapath_4.1.2.3.0-8.0.23382415: VMware_bootbank_nsx-esx-datapath_4.1.2.3.0-8.0.23382415: Error in running [/etc/init.d/nsx-datapath start upgrade]:
    <timestamp> Db(15)[+] esxupdate[2111175] Return code: 1
    <timestamp> Db(15)[+] esxupdate[2111175] Output: start upgrade begin
    <timestamp> Db(15)[+] esxupdate[2111175] Exception:
    <timestamp> Db(15)[+] esxupdate[2111175] Traceback (most recent call last):
    <timestamp> Db(15)[+] esxupdate[2111175]   File "/etc/init.d/nsx-datapath", line 2009, in <module>
    <timestamp> Db(15)[+] esxupdate[2111175]     UnloadKernelModules(True, False)
    <timestamp> Db(15)[+] esxupdate[2111175]   File "/etc/init.d/nsx-datapath", line 1741, in UnloadKernelModules
    <timestamp> Db(15)[+] esxupdate[2111175]     unloadModule(modName, 'nsxt-vsip' in modName)
    <timestamp> Db(15)[+] esxupdate[2111175]   File "/etc/init.d/nsx-datapath", line 1691, in unloadModule
    <timestamp> Db(15)[+] esxupdate[2111175]     raise KernelModulesException('Failed to unload module %s: %s' %
    <timestamp> Db(15)[+] esxupdate[2111175] KernelModulesException: Failed to unload module nsxt-kcp-23382415: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-kcp-23382415 failed: Busy (bad0004)
    <timestamp> Db(15)[+] esxupdate[2111175] Cannot remove module nsxt-kcp-23382415: module symbols in use
    <timestamp> Db(15)[+] esxupdate[2111175] It is not safe to continue. Please reboot the host immediately to discard the unfinished update.. Clean up the installation.
    <timestamp> In(14) esxupdate[2111175] runcommand called with: args = ['/usr/lib/vmware/vob/bin/addvob', 'vob.user.esximage.install.error', 'VMware_bootbank_nsx-esx-datapath_4.1.2.3.0-8.0.23382415: VMware_bootbank_nsx-esx-datapath_4.1.2.3.0-8.0.23382415: Error in running [/etc/init.d/nsx-datapath start upgrade]:\nReturn code: 1\nOutput: start upgrade begin\nException:\nTraceback (most recent call last):\n  File "/etc/init.d/nsx-datapath", line 2009, in <module>\n    UnloadKernelModules(True, False)\n  File "/etc/init.d/nsx-datapath", line 1741, in UnloadKernelModules\n    unloadModule(modName, \'nsxt-vsip\' in modName)\n  File "/etc/init.d/nsx-datapath", line 1691, in unloadModule\n    raise KernelModulesException(\'Failed to unload module %s: %s\' %\nKernelModulesException: Failed to unload module nsxt-kcp-23382415: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-kcp-23382415 failed: Busy (bad0004)\nCannot remove module nsxt-kcp-23382415: module symbols in use\n\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.'], outfile = None, returnoutput = True, timeout = 0.0.
    2025-06-03T13:04:05Z In(14) esxupdate[2111175] runcommand called with: args = '/bin/localcli system visorfs ramdisk remove -t /usr/lib/vmware/lifecycle/stageliveimage', outfile = None, returnoutput = True, timeout = 0.0.
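
To confirm that a host hit this failure, the log can be searched for the module unload error. The following is a minimal sketch, assuming the log lines have the same shape as the excerpt above. The function name `find_unload_failures` and the regular expression are illustrative assumptions, not part of any VMware tool.

```python
import re

# Pattern modeled on the KernelModulesException line in the excerpt above.
# Assumption: other builds log the failure in the same format.
UNLOAD_FAILURE = re.compile(
    r"Failed to unload module ([\w.-]+): .*?failed: (\w+) \((\w+)\)"
)


def find_unload_failures(log_text):
    """Return unique (module, status, code) tuples in order of first appearance."""
    seen = {}
    for line in log_text.splitlines():
        for match in UNLOAD_FAILURE.finditer(line):
            # The same failure is logged more than once (traceback and VOB
            # message), so keep only the first occurrence of each tuple.
            seen.setdefault(match.groups(), None)
    return list(seen)
```

Passing the contents of esxupdate.log to `find_unload_failures` returns one entry per module that failed to unload. The format-string line in the traceback (`'Failed to unload module %s: %s'`) is not matched, because `%s` is not a valid module name under this pattern.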

Cause

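The upgrade cannot unload the NSX KCP kernel module (nsxt-kcp) because other components still hold references to it. The log reports this as "module symbols in use".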

Resolution

  1. Check for unused DVS attached to the hosts, and for errors reported on any DVS.
  2. Remove the hosts from any unused DVS, or remove the unused DVS from the environment.
  3. Restart the upgrade workflow from the Upgrade page in the NSX UI.
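
The check in step 1 can be sketched as a filter over inventory data collected by hand from vCenter. This is only an illustration: the `DvsAttachment` record, its fields, and the rule that "unused" means zero ports in use on a host are assumptions, not definitions from this article or from any VMware API.

```python
from dataclasses import dataclass


@dataclass
class DvsAttachment:
    """One host-to-DVS attachment, recorded manually (hypothetical structure)."""
    dvs_name: str
    host: str
    ports_in_use: int  # assumed: VM and VMkernel ports in use on this host


def unused_attachments(records):
    """Return (host, dvs_name) pairs where the host uses no ports on the DVS."""
    return [(r.host, r.dvs_name) for r in records if r.ports_in_use == 0]
```

Each pair returned is a candidate for step 2. Verify it in vCenter before removing the host from the DVS.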

Additional Information

A similar error is described in KB 312644.