Unable to delete NSX VIBs using "nsxcli -c del nsx" command

Article ID: 379790


Updated On:

Products

VMware NSX

VMware NSX-T Data Center

Issue/Introduction

  • Attempting to remove the NSX VIBs from ESXi host(s) fails with an error message that is displayed on screen and logged in /var/log/esxupdate.log:
2023-04-17T06:44:49Z esxupdate: 3227255: LiveImageInstaller: WARNING: Handling Live Vib Failure: VMware_bootbank_nsx-opsagent_3.0.0.0.0-7.0.15945993: Error in running [/etc/init.d/nsx-opsagent stop upgrade]: Return code: 1 Output: OK to upgrade nsx-opsagent stop nsx-opsagent stop watchdog watchdog-opsAgent[3227723]: Terminating watchdog process with PID 2917842 nsx-opsagent service is stopped cp: can't stat '/etc/vmware/nsxa/host_config.bin': No such file or directory
2023-04-17T06:45:02Z esxupdate: 3227255: HostImage: DEBUG: installer LiveImageInstaller failed: VMware_bootbank_nsx-opsagent_3.0.0.0.0-7.0.15945993: VMware_bootbank_nsx-opsagent_3.0.0.0.0-7.0.15945993: Error in running [/etc/init.d/nsx-opsagent stop upgrade]: Return code: 1 Output: OK to upgrade nsx-opsagent stop nsx-opsagent stop watchdog watchdog-opsAgent[3227723]: Terminating watchdog process with PID 2917842 nsx-opsagent service is stopped cp: can't stat '/etc/vmware/nsxa/host_config.bin': No such file or directory  It is not safe to continue. Please reboot the host immediately to discard the unfinished update.. Clean up the installation.
2023-04-17T06:45:02Z esxupdate: 3227255: vmware.runcommand: INFO: runcommand called with: args = '['/usr/lib/vmware/vob/bin/addvob', 'vob.user.esximage.install.error', "VMware_bootbank_nsx-opsagent_3.0.0.0.0-7.0.15945993: VMware_bootbank_nsx-opsagent_3.0.0.0.0-7.0.15945993: Error in running [/etc/init.d/nsx-opsagent stop upgrade]:\nReturn code: 1\nOutput: OK to upgrade\nnsx-opsagent stop\nnsx-opsagent stop watchdog\nwatchdog-opsAgent[3227723]: Terminating watchdog process with PID 2917842\nnsx-opsagent service is stopped\ncp: can't stat '/etc/vmware/nsxa/host_config.bin': No such file or directory\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update."]', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
  • Running 'del nsx' on the ESXi host in the nsxcli shell does not remove the VIBs.

  • Removing NSX-T from the host in the NSX-T UI using the force option also fails to remove the VIBs.

  • Reboot of the host does not resolve the issue.

  • Running the "nsxcli -c del nsx" command to remove the NSX VIBs returns an error:
% Exception when deleting nsx from host: 'error code: 4 stdout: delete_nsx_instance_from_host.sh: INFO: NSX reset script called with argument fabric_node on nsx-esx delete_nsx_instance_from_host.sh: INFO: Run transport_node reset on ESX node % Failed to remove all host switches or logical switches delete_nsx_instance_from_host.sh: ERROR: Failed to reset nsxa app of nsx-opsagent. Please check ospagent logs for more details.  stderr: ERROR: Failed to reset nsxa app of nsx-opsagent. Please check ospagent logs for more details.'
  • The nsx-opsagent service is not running. Restarting it on the ESXi host with "/etc/init.d/nsx-opsagent restart" succeeds, but after some time the service returns to a not-running state, as shown by "/etc/init.d/nsx-opsagent status".

Environment

VMware NSX-T Data Center 

VMware NSX

Cause

The "nsxcli -c del nsx" command fails because some NSX properties remain on the host and must be deleted before the VIBs can be removed.

Log location : /var/run/log/nsxaVim.log

2023-04-14T10:17:22Z nsxaVim: [2105560]: INFO All NSX properties are not removed by cfgAgent yet, retry times left 59
  • Note the highlighted properties. In this example, two properties are present, which causes the deletion command to fail.
  • The properties com.vmware.net.portset.fc.enabled and com.vmware.net.portset.fc.mcast.enabled must be deleted.
  • Other properties may also need to be deleted; they are listed in the log entry or entries. These two are simply the ones listed in this specific example.
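To locate the relevant entries quickly, the log can be filtered for the message shown above. The following is a minimal sketch, assuming the message text matches the example in this article; show_leftover_property_messages is a hypothetical helper name, and the default path is the log location given above.

```shell
# Sketch: show the most recent nsxaVim log lines that report leftover NSX
# properties. The search string is taken from the log example in this article.
show_leftover_property_messages() {
    # Default to the log location named in this article; allow an override.
    log_file="${1:-/var/run/log/nsxaVim.log}"
    grep 'NSX properties are not removed' "$log_file" | tail -n 5
}

# Usage on the ESXi host:
#   show_leftover_property_messages
```

The property names themselves still have to be read from the surrounding log entries; this helper only narrows down where to look.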

Resolution

Run the esxcfg-vswitch -l command to obtain the name of each vDS that needs to be modified.



For each vDS in use by NSX, issue a command similar to the following to disable the active NSX kernel module(s):

Command syntax : net-dvs -u <propertyName> -p hostPropList <SwitchName>

Example commands to remove the properties com.vmware.net.portset.fc.enabled and com.vmware.net.portset.fc.mcast.enabled :

  • net-dvs -u com.vmware.net.portset.fc.enabled -p hostPropList DSswith-Nested-HCX
  • net-dvs -u com.vmware.net.portset.fc.mcast.enabled -p hostPropList DSswith-Nested-HCX
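When several properties are listed in the log, it can help to generate the commands rather than type each one. The sketch below only prints the commands so they can be reviewed before being run; print_unset_commands is a hypothetical helper, and the command form simply mirrors the example commands above.

```shell
# Sketch: print (do not execute) one net-dvs unset command per leftover
# property. The first argument is the switch name; the rest are property names.
print_unset_commands() {
    switch_name="$1"
    shift
    for property in "$@"; do
        printf 'net-dvs -u %s -p hostPropList %s\n' "$property" "$switch_name"
    done
}

# Usage: review the printed commands, then run them on the ESXi host.
#   print_unset_commands DSswith-Nested-HCX \
#       com.vmware.net.portset.fc.enabled \
#       com.vmware.net.portset.fc.mcast.enabled
```

Printing instead of executing is a deliberate choice here, since removing a property from the wrong switch is hard to undo.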

Steps after removing the properties, to be performed on each affected ESXi host:

  1. Run the nsxcli -c del nsx command to remove the NSX VIBs.
  2. Run the esxcli software vib list | grep -E 'nsx|vsipfwlib' command to validate that no NSX VIBs are installed on the host. There should be no output.
  3. Confirm that the removal is complete with esxcli software vib list | grep nsx, which should also return no output.
  4. If the NSX VIB deletion is successful (NSX VIBs are no longer listed), configure NSX on the host transport node using the NSX UI.
  5. If the NSX VIB deletion is not successful (NSX VIBs are still listed), engage Broadcom support for further investigation.
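The verification in steps 2 and 3 can be wrapped so that the result is explicit rather than inferred from empty output. This is a minimal sketch: nsx_vibs_remaining is a hypothetical helper that reads the VIB list on standard input and uses the same pattern as step 2.

```shell
# Sketch: read "esxcli software vib list" output on stdin and report whether
# any NSX VIBs remain. Matching lines are printed so they can be inspected.
nsx_vibs_remaining() {
    if grep -E 'nsx|vsipfwlib'; then
        echo "NSX VIBs still installed" >&2
        return 1
    fi
    echo "No NSX VIBs found"
    return 0
}

# Usage on the ESXi host:
#   esxcli software vib list | nsx_vibs_remaining
```

A non-zero return status corresponds to step 5 (VIBs still listed); a zero status corresponds to step 4.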

Additional Information

Removing the properties com.vmware.nsx.vdl2.enabled and com.vmware.nsx.vdr.dvptracking may fail.
The failure states that the resource is busy. The cause is the presence of the kernel interfaces created when NSX is installed on the host (vmk10, vmk11, vmk50).
These interfaces must be removed with the following commands, executed from the ESXi host command line.

Removal of a kernel interface:

esxcli network ip interface remove --interface-name=vmk10

Verification can be done with the following command:
esxcfg-vmknic -l
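To avoid issuing a removal for an interface that does not exist, the listing can be checked first. The sketch below is a hypothetical helper (print_vmk_removals) that reads the esxcfg-vmknic -l output on standard input and prints, without executing, a removal command for each of the three interface names mentioned above that appears in it. It assumes only that the interface name appears as a whole word somewhere in the listing.

```shell
# Sketch: read "esxcfg-vmknic -l" output on stdin and print a removal command
# for each NSX kernel interface named in this article that is present.
print_vmk_removals() {
    listing=$(cat)
    for vmk in vmk10 vmk11 vmk50; do
        # -w matches whole words only, so vmk1 does not match vmk10.
        if printf '%s\n' "$listing" | grep -qw "$vmk"; then
            printf 'esxcli network ip interface remove --interface-name=%s\n' "$vmk"
        fi
    done
}

# Usage on the ESXi host: review the printed commands before running them.
#   esxcfg-vmknic -l | print_vmk_removals
```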

This issue is normally caused by hosts being removed incorrectly from NSX management, which leaves the data on the ESXi hosts and the NSX Managers out of sync.
This article addresses the issues left on the ESXi host when it has been removed incorrectly.

There may be an issue left to address on the NSX Manager concerning the release of the TEP IPs back to the TEP IP pool.
The article "TEP IP Addresses Not Released After FORCE Deleting Host/Edge Transport Node in NSX-T UI" describes the manual TEP IP release process.