While doing the upgrade from NSX 3.2 to 4.1, NSX VIB upgrade failed with error "Error in running [/etc/init.d/nsx-datapath-dl start upgrade]:Return code: 1"

Article ID: 369823

Products

VMware NSX

Issue/Introduction

During an upgrade from NSX 3.2.1 to 4.1.1, the NSX VIB upgrade fails with the following error:

Unexpected error while upgrading upgrade unit: Install of offline bundle failed on host xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx with error : [LiveInstallationError]
VMware_bootbank_nsx-esx-datapath_4.1.1.0.1-7.0.23358860: Error in running [/etc/init.d/nsx-datapath-dl start upgrade]:
Return code: 1
Output: start upgrade begin
Exception:
Traceback (most recent call last):
  File "/etc/init.d/nsx-datapath-dl", line 1247, in <module>
    DualLoadUpgrade()
  File "/etc/init.d/nsx-datapath-dl", line 1062, in DualLoadUpgrade
    LoadKernelModules()
  File "/etc/init.d/nsx-datapath-dl", line 213, in LoadKernelModules
    nsxesxutils.loadModule(modName, modParam)
  File "/usr/lib/vmware/nsx-esx-datapath/lib64/python/nsxesxutils.py", line 579, in loadModule
    raise Exception('Failed to load module %s: %s' %
Exception: Failed to load module nsx-esx-70u3/nsxt-ens-23358860: vmkmod: VMKModLoad: VMKernel_LoadKernelModule(nsxt-ens-23358860): Failure
Cannot load module nsx-esx-70u3/nsxt-ens-23358860: Failure
It is not safe to continue. Please reboot the host immediately to discard the unfinished update.
cause = ('nsx-lcp-bundle(4.1.1.0.1-7.0.23358860)', 'nsx-datapath-dl', 'Error in running [/etc/init.d/nsx-datapath-dl start upgrade]:\nReturn code: 1\nOutput: start upgrade begin\nException:\nTraceback (most recent call last):\n File "/etc/init.d/nsx-datapath-dl", line 1247, in <module>\n DualLoadUpgrade()\nFile "/etc/init.d/nsx-datapath-dl", line 1062, in DualLoadUpgrade\n LoadKernelModules()\nFile "/etc/init.d/nsx-datapath-dl", line 213, in LoadKernelModules\n nsxesxutils.loadModule(modName, modParam)\nFile "/usr/lib/vmware/nsx-esx-datapath/lib64/python/nsxesxutils.py", line 579, in loadModule\n raise Exception(\'Failed to load module %s: %s\' %\nException: Failed to load module nsx-esx-70u3/nsxt-ens-23358860: vmkmod: VMKModLoad: VMKernel_LoadKernelModule(nsxt-ens-23358860): Failure\nCannot load module nsx-esx-70u3/nsxt-ens-23358860: Failure\n\n')
vibs = ['VMware_bootbank_nsx-esx-datapath_4.1.1.0.1-7.0.23358860']
Please refer to the log file for more details.

 

Environment

VMware NSX-T
VMware NSX Data Center

Cause

This failure occurs when net-stats is running on the host during the upgrade, for example when it is run periodically to monitor pollworld usage. net-stats opens a char device that is owned by the NSX modules, so the old module cannot be unloaded and the new module fails to create its ENS char device.

vmkernel.log:

2024-05-09T16:01:16.716Z cpu9:181055275)Loading module nsxt-ens-23358860 ...
2024-05-09T16:01:16.732Z cpu9:181055275)Elf: 2119: module nsxt-ens-23358860 has license VMware
2024-05-09T16:01:16.741Z cpu9:181055275)Successfully created 4 ENS affinity heaps.
2024-05-09T16:01:16.744Z cpu9:181055275)FPO: 155: FPO Service Registered
2024-05-09T16:01:16.744Z cpu9:181055275)WARNING: CharDriver: 357: Driver with name ens is already using slot 125
2024-05-09T16:01:16.744Z cpu9:181055275)WARNING: Failed to create ENS char device 
2024-05-09T16:01:16.744Z cpu9:181055275)FPO: 227: FPO Service Unregistered
2024-05-09T16:01:16.745Z cpu9:181055275)nsxt-ens-23358860 failed to load. 
2024-05-09T16:01:16.745Z cpu9:181055275)WARNING: Elf: 3139: Kernel based module load of nsxt-ens-23358860 failed: Failure <Mod_LoadDone failed>
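
The two warnings in the excerpt above can serve as a signature for this issue. The following is a minimal sketch, not an official tool: the function name has_ens_chardev_conflict is hypothetical, and the patterns are taken only from the log lines shown in this article, so they may not match other builds.

```shell
# Hypothetical helper: reads vmkernel log text on stdin and succeeds (exit 0)
# only if both warnings from the excerpt are present:
#   - the CharDriver warning that the "ens" driver name is already using a slot
#   - the failure to create the ENS char device
has_ens_chardev_conflict() {
    log_text=$(cat)
    printf '%s\n' "$log_text" |
        grep -q 'CharDriver: .*Driver with name ens is already using slot' &&
    printf '%s\n' "$log_text" |
        grep -q 'Failed to create ENS char device'
}
```

Assuming the log is at /var/log/vmkernel.log on the host, it could be used as: has_ens_chardev_conflict < /var/log/vmkernel.log && echo "ENS char device conflict found"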

Resolution

Before upgrading the NSX-T bundle, check whether the net-stats command is running on the host.

Find the process ID (PID) of net-stats using the "ps" command:
ps | grep -i "net-stats"

Kill the running "net-stats" process using the "kill" command:
kill <service_pid>
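
When several net-stats instances are running, the two commands above can be combined into one small script. This is a sketch under stated assumptions: the function names netstats_pids and kill_netstats are hypothetical, the first column of the ps output is assumed to hold the ID that kill expects, and the match is on the lowercase string net-stats only. Verify the ps column layout on your host before relying on it.

```shell
# Hypothetical helper: reads ps-style output on stdin and prints the first
# column of every line that mentions net-stats, skipping any grep process.
# Assumption: the first column is the ID that "kill" expects.
netstats_pids() {
    while read -r pid rest; do
        case "$rest" in
            *grep*) ;;                            # ignore the grep process itself
            *net-stats*) printf '%s\n' "$pid" ;;  # a net-stats instance
        esac
    done
}

# Hypothetical helper: sends the default signal to every net-stats process
# found in the live ps output. Defined here but not called automatically.
kill_netstats() {
    for pid in $(ps | netstats_pids); do
        echo "Killing net-stats process $pid"
        kill "$pid"
    done
}
```

Running "ps | netstats_pids" first shows which processes would be affected; calling kill_netstats then kills them.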

net-stats must be killed before the upgrade is started. Otherwise the live upgrade fails, because the old module cannot be unloaded to make way for the new module. This applies to both “In-Place” and “Maintenance” mode upgrades, as both require dual load.

It is recommended to kill all net-stats instances before performing the upgrade. The upgrade succeeds once the net-stats process is killed.

A precheck for net-stats will be added to the in-place upgrade logic in NSX-T 5.0 to detect this condition. It will ensure that all net-stats instances are killed automatically before the upgrade starts.
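
On releases without such a precheck, a similar check can be run by hand before starting the upgrade. The sketch below is not the product's precheck logic. The function name no_netstats_running is hypothetical, and it only reports whether a net-stats line appears in ps-style output; it does not kill anything.

```shell
# Hypothetical manual gate: reads ps-style output on stdin.
# Succeeds (exit 0) when no line mentions net-stats, fails (exit 1) as soon
# as one does. Lines belonging to a grep process are ignored.
no_netstats_running() {
    while read -r line; do
        case "$line" in
            *grep*) ;;                 # ignore the grep process itself
            *net-stats*) return 1 ;;   # a net-stats instance is still running
        esac
    done
    return 0
}
```

For example: if ps | no_netstats_running; then echo "No net-stats running, OK to start the upgrade"; else echo "net-stats is still running, kill it first"; fi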