nsxt-ipfix module fails to unload during the NSX-T host upgrade.

Article ID: 319080

Products

VMware NSX

Issue/Introduction

  • Netflow (ESXi IPFIX) is configured in vCenter, or SWITCH IPFIX is configured in NSX-T.
  • At the start of the NSX-T upgrade, the source (FROM) versions are ESXi 7.0.3 or later and NSX-T 3.0.x or 3.1.x (up to and including 3.1.3.7).
  • The host upgrade fails with one of the following errors:
    • vLCM-based upgrade: "Upgrade failed: Failed to apply Component nsx-lcp-bundle(3.1.3.7.4-7.0.19746859): an error occurred while enabling service nsx-datapath-dl"
    • Non-vLCM-based upgrade: "Unloading module nsxt-ipfix-19068435 failed: Busy (bad0004)\nCannot remove module nsxt-ipfix-19068435: module symbols in use\n\n') vibs = ['VMware_bootbank_nsx-esx-datapath_3.1.3.7.0-7.0.19380480'] Please refer to the log file for more details.."
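The version conditions listed above can be expressed as a small check. This is only a sketch: the function names are hypothetical, and it assumes versions are given as plain dotted numbers with at most four components (for example "7.0.3" and "3.1.3.7"), without build suffixes.

```python
def parse_version(text):
    """Turn a dotted version such as '3.1.3.7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def matches_issue_conditions(esxi_version, nsxt_from_version):
    """Return True when the source versions match the conditions in this article.

    Assumption: both versions have at most four numeric components.
    """
    esxi = parse_version(esxi_version)
    nsxt = parse_version(nsxt_from_version)

    # ESXi 7.0.3 or later
    esxi_matches = esxi >= (7, 0, 3)

    # NSX-T 3.0.x, or 3.1.x up to and including 3.1.3.7
    nsxt_matches = nsxt[:2] == (3, 0) or (
        nsxt[:2] == (3, 1) and nsxt <= (3, 1, 3, 7)
    )
    return esxi_matches and nsxt_matches


print(matches_issue_conditions("7.0.3", "3.1.3.7"))
print(matches_issue_conditions("7.0.2", "3.1.3.7"))
```

A match only means the version conditions hold; the issue additionally requires that Netflow or SWITCH IPFIX is configured.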

Environment

VMware NSX

Cause

The issue is caused by changes made to the VMKAPI in ESXi 7.0 U3. Because of these changes, the IPFIX module still has a non-zero reference count when an unload is attempted, so the unload of the IPFIX module fails.

Resolution

This issue is resolved in:

NSX-T 3.1.3.7.4, and NSX-T 3.2.1 and later releases.
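As a rough illustration, the fixed releases named above could be checked as follows. The function name is hypothetical. Because the article names only build 3.1.3.7.4 on the 3.1 line, the sketch deliberately treats no other 3.1.x build as fixed; that strict reading is an assumption, not a statement about other 3.1.x builds.

```python
def is_listed_fixed_release(nsxt_version):
    """Return True if the version is one this article lists as containing the fix.

    Assumption: the version is a plain dotted number such as '3.2.1'.
    """
    version = tuple(int(part) for part in nsxt_version.split("."))

    if version[:2] == (3, 1):
        # Only 3.1.3.7.4 is named on the 3.1 line; other builds are not assumed fixed.
        return version == (3, 1, 3, 7, 4)

    # 3.2.1 and later releases; 3.0.x and 3.2.0.x compare lower and return False.
    return version >= (3, 2, 1)


for candidate in ("3.1.3.7", "3.1.3.7.4", "3.2.0.1", "3.2.1"):
    print(candidate, is_listed_fixed_release(candidate))
```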

Workaround:

Scenario 1:

Applies when NSX-T is being upgraded from 3.0.x or 3.1.x (up to 3.1.3.7) AND the ESXi version is 7.0.3.x.

  1. Before the upgrade, unconfigure "Netflow" and/or "SWITCH IPFIX Profiles" from VRNI or the third-party flow collectors. If applicable, disable Netflow/IPFIX in both the NSX-T and VC data sources in VRNI.
  2. Wait at least 5 minutes after unconfiguring IPFIX, then reboot the ESXi host.
  3. Attempt the NSX-T host upgrade.

Scenario 2:

If IPFIX (DVS/Switch) has been used since the last reboot of the ESXi host, the NSX-T host upgrade may fail. In that case, reboot the ESXi host to clear the error state, then retry the upgrade.

Scenario 3:

Applies when NSX-T is being upgraded to 3.2.1 or a later release AND the ESXi version is 7.0.3.x.

In these releases, NSX-T introduced a pre-check that alerts the user about the IPFIX configuration, and the hosts may be rebooted automatically during the upgrade without user intervention.

Note:

  1. Because the host must be rebooted to work around the issue, the in-place host upgrade mode cannot be used.
  2. Netflow and SWITCH IPFIX profiles can be reconfigured after the NSX-T host upgrade.
  3. The IPFIX module unload issue does NOT occur when only "Firewall IPFIX Profiles" are configured, so there is NO need to clear the "Firewall IPFIX Profiles" or the Collector configuration of type "IPFIX Profile".
  4. For upgrades via vLCM, the pre-check only alerts the user; the user must reboot the ESXi host manually to complete the NSX-T host upgrade. Ensure that no VM migrates to the rebooted ESXi host before the host upgrade completes.
  5. If the Netflow policy has been overridden for a specific dvport ID, ensure that Netflow is disabled for that dvport ID.
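The reboot behaviour described in the scenarios and notes above can be summarised as a small decision helper. This is a sketch only: the function name and the returned strings are hypothetical, and it assumes the host already matches the issue conditions (ESXi 7.0.3.x with Netflow or SWITCH IPFIX configured).

```python
def reboot_handling(target_nsxt_version, uses_vlcm):
    """Describe who is expected to reboot the host, per the scenarios above.

    Assumption: target_nsxt_version is a plain dotted number such as '3.2.1'.
    """
    target = tuple(int(part) for part in target_nsxt_version.split("."))

    if target >= (3, 2, 1):
        if uses_vlcm:
            # Note 4: with vLCM the pre-check only alerts the user.
            return "pre-check alert only; reboot the host manually"
        # Scenario 3: hosts may be rebooted automatically during the upgrade.
        return "pre-check alert; host may be rebooted automatically"

    # Scenario 1: no pre-check, so the workaround is carried out by hand.
    return "unconfigure IPFIX, wait 5 minutes, reboot the host manually"


print(reboot_handling("3.2.1", uses_vlcm=True))
print(reboot_handling("3.1.3.7.4", uses_vlcm=False))
```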

If VRNI or the third-party flow collectors are unavailable for unconfiguring IPFIX, follow the manual steps below.

Steps to remove Netflow manually from vCenter

  1. Navigate to "vCenter Home" > "Menu" > "Networking" > Right-click on "VDS" name > "Settings" > "Edit Netflow"
  2. Remove "Collector IP address" and "Collector Port"
  3. Navigate to "vCenter Home" > "Menu" > "Networking" > Right-click on "Portgroup" name under VDS > "Edit Settings" > "Monitoring"
  4. Set Netflow to "Disabled"
  5. Repeat steps 3 and 4 for all the portgroups.

Steps to remove SWITCH IPFIX Profiles

  1. Navigate to "NSX-T Policy UI" > "Plan & Troubleshoot" > "IPFIX" > "SWITCH IPFIX Profiles"
  2. In the "Applied to" field, uncheck all the selected objects.

Additional Information

On the NSX-T UI, an error similar to the following is displayed:

Install of offline bundle failed on host 0b830eb1-####-####-####-##########09 with error : [LiveInstallationError] VMware_bootbank_nsx-esx-datapath_3.1.3.7.0-7.0.19380480: Error in running [/etc/init.d/nsx-datapath-dl start upgrade]: Return code: 1 Output: /usr/lib/vmware/nsx-esx-datapath/lib/python3.5/nsxesxutils.py:352: SyntaxWarning: "is not" with a literal. Did you mean "!="? if DEV_BUILDTYPE is not "beta": start upgrade begin Exception: Traceback (most recent call last): File "/etc/init.d/nsx-datapath-dl", line 1154, in <module> DualLoadUpgrade() File "/etc/init.d/nsx-datapath-dl", line 953, in DualLoadUpgrade PreUpgrade() File "/etc/init.d/nsx-datapath-dl", line 794, in PreUpgrade UnloadNonDLModules() File "/etc/init.d/nsx-datapath-dl", line 146, in UnloadNonDLModules nsxesxutils.unloadModule(modName, False) File "/usr/lib/vmware/nsx-esx-datapath/lib/python3.5/nsxesxutils.py", line 446, in unloadModule raise Exception('Failed to unload module %s: %s' % Exception: Failed to unload module nsxt-ipfix-19068435: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-ipfix-19068435 failed: Busy (bad0004) Cannot remove module nsxt-ipfix-19068435: module symbols in use It is not safe to continue. Please reboot the host immediately to discard the unfinished update. cause = ('nsx-lcp-bundle(3.1.3.7.0-7.0.19380480)', 'nsx-datapath-dl', 'Error in running [/etc/init.d/nsx-datapath-dl start upgrade]:\nReturn code: 1\nOutput: /usr/lib/vmware/nsx-esx-datapath/lib/python3.5/nsxesxutils.py:352: SyntaxWarning: "is not" with a literal. 
Did you mean "!="?\n if DEV_BUILDTYPE is not "beta":\nstart upgrade begin\nException:\nTraceback (most recent call last):\n File "/etc/init.d/nsx-datapath-dl", line 1154, in <module>\n DualLoadUpgrade()\n File "/etc/init.d/nsx-datapath-dl", line 953, in DualLoadUpgrade\n PreUpgrade()\n File "/etc/init.d/nsx-datapath-dl", line 794, in PreUpgrade\n UnloadNonDLModules()\n File "/etc/init.d/nsx-datapath-dl", line 146, in UnloadNonDLModules\n nsxesxutils.unloadModule(modName, False)\n File "/usr/lib/vmware/nsx-esx-datapath/lib/python3.5/nsxesxutils.py", line 446, in unloadModule\n raise Exception(\'Failed to unload module %s: %s\' %\nException: Failed to unload module nsxt-ipfix-19068435: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-ipfix-19068435 failed: Busy (bad0004)\nCannot remove module nsxt-ipfix-19068435: module symbols in use\n\n') vibs = ['VMware_bootbank_nsx-esx-datapath_3.1.3.7.0-7.0.19380480'] Please refer to the log file for more details..

 

On the ESXi host, the following error is observed in /var/run/log/esxupdate.log:

2022-xx-xxTxx:xx:24Z esxupdate: 2136236: LiveImageInstaller: DEBUG: Output: being upgraded Killed failed to start

2022-xx-xxTxx:xx:31Z esxupdate: 2136236: HostImage: DEBUG: installer LiveImageInstaller failed: VMware_bootbank_nsx-esx-datapath_3.1.3.7.0-7.0.19380480: VMware_bootbank_nsx-esx-datapath_3.1.3.7.0-7.0.19380480: Error in running [/etc/init.d/nsx-datapath-dl start upgrade]: Return code: 1 Output: /usr/lib/vmware/nsx-esx-datapath/lib/python3.5/nsxesxutils.py:352: SyntaxWarning: "is not" with a literal. Did you mean "!="?   if DEV_BUILDTYPE is not "beta": start upgrade begin Exception: Traceback (most recent call last):   File "/etc/init.d/nsx-datapath-dl", line 1154, in <module>     DualLoadUpgrade()   File "/etc/init.d/nsx-datapath-dl", line 953, in DualLoadUpgrade     PreUpgrade()   File "/etc/init.d/nsx-datapath-dl", line 794, in PreUpgrade     UnloadNonDLModules()   File "/etc/init.d/nsx-datapath-dl", line 146, in UnloadNonDLModules     nsxesxutils.unloadModule(modName, False)   File "/usr/lib/vmware/nsx-esx-datapath/lib/python3.5/nsxesxutils.py", line 446, in unloadModule     raise Exception('Failed to unload module %s: %s'

2022-03-17T04:47:31Z esxupdate:  % Exception: Failed to unload module nsxt-ipfix-19068435: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-ipfix-19068435 failed: Busy (bad0004) Cannot remove module nsxt-ipfix-19068435: module symbols in use   It is not safe to continue. Please reboot the host immediately to discard the unfinished update.. Clean up the installation.

2022-03-17T04:47:31Z esxupdate: tils.py", line 446, in unloadModule\n    raise Exception(\'Failed to unload module %s: %s\' %\nException: Failed to unload module nsxt-ipfix-19068435: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-ipfix-19068435 failed: Busy (bad0004)\nCannot remove module nsxt-ipfix-19068435: module symbols in use\n\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.

2022-03-17T04:47:31Z esxupdate: 2136236: root: ERROR:     raise Exception('Failed to unload module %s: %s' %

2022-03-17T04:47:31Z esxupdate: 2136236: root: ERROR: Exception: Failed to unload module nsxt-ipfix-19068435: vmkmod: VMKMod_UnloadModule: Unloading module nsxt-ipfix-19068435 failed: Busy (bad0004)
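To recognise this signature in a collected log, a short script along the following lines could be used. It is a minimal sketch: the function name is hypothetical, and the pattern is derived only from the "Unloading module nsxt-ipfix-... failed: Busy (bad0004)" text in the excerpts above.

```python
import re

# Pattern taken from the error text shown in the log excerpts above.
UNLOAD_FAILURE = re.compile(
    r"Unloading module (nsxt-ipfix-\d+) failed: Busy \(bad0004\)"
)


def find_ipfix_unload_failures(log_text):
    """Return the distinct nsxt-ipfix module names that failed to unload."""
    return sorted(set(UNLOAD_FAILURE.findall(log_text)))


sample = (
    "esxupdate: 2136236: root: ERROR: Exception: Failed to unload module "
    "nsxt-ipfix-19068435: vmkmod: VMKMod_UnloadModule: Unloading module "
    "nsxt-ipfix-19068435 failed: Busy (bad0004)"
)
print(find_ipfix_unload_failures(sample))
```

In practice the text would be read from a copy of /var/run/log/esxupdate.log instead of the inline sample.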

Impact/Risks

NSX-T fails to upgrade an ESXi host.