Uninstalling NSX-T from ESXi host failing with error "Failed to remove all host switches or logical switches"



Article ID: 322468


Products

VMware NSX

Issue/Introduction

  • While attempting to install NSX-T on a host, the installation fails at 18% with the following error:
    "failed to install software on a host <hostname>:java.rmi.RemoteException:

    [Live installation error] Error in running ['/etc/init.d/nsx-opsagent', 'stop', 'upgrade']: Return code : 1 Output OK to upgrade nsx-opsagent stop nsx-opsagent stop watchdog

    Terminating watchdog process with process PID 2105211 sh: you need to specify whom to kill nsx-ops-agent service is stopping cp: can't stat"
  • The Resolve option does not help. Running 'del nsx' in nsxcli on the ESXi host returns the following error:
    Exception when deleting nsx from host: ' error code: 4 stdout: delete_nsx_instance_from_host.sh: INFO: NSX reset script called with argument fabric_node on nsx-esx delete_nsx_instance_from_host.sh: INFO: Run transport_node reset on ESX node % Failed to remove all host switches or logical switches delete_nsx_instance_from_host.sh: ERROR: Failed to reset nsxa app of nsx-opsagent. Please check ospagent logs for more details. , stderr: <date-time> ERROR: Failed to reset nsxa app of nsx-opsagent. Please check ospagent logs for more details."

 

  • The /var/run/log/esxupdate.log file on the ESXi host shows vdl2 module unload failures:
    cpu48:4580298)Mod: 5098: Unloading module <vmk-module-uuid> ...

    cpu48:4580298)vdl2: VDL2Cleanup:756: [nsx@6876 comp="nsx-esx" subcomp="<vmk-module-uuid>"]Starting cleanup

    cpu48:4580298)ALERT: Mod: 5251: Failed to unload module <vmk-module-uuid>, since its consumed resource count is 1. Waiting...

    cpu48:4580298)ALERT: Mod: 5280: Failed to unload module <vmk-module-uuid>, since its consumed resource count is
  • The following host properties are set to true on the DVS, as shown by running net-dvs -l:

    com.vmware.nsx.kcp.enable

    com.vmware.nsx.spf.enabled

    com.vmware.nsx.vdl2.enabled

    com.vmware.net.portset.fc.enabled

    com.vmware.net.portset.fc.mcast.enabled
  • At least one host switch has the following property set to down:
    net-dvs -l | grep -E "com.vmware.common.alias|com.vmware.common.opaqueDvs.status.component.vswitch"
    Example output:
    com.vmware.common.alias = <DVS name> ,        propType = CONFIG
    com.vmware.common.opaqueDvs.status.component.vswitch=down
  • /var/run/log/nsxavim.log on the host may also show entries similar to the following:
    INFO HowSwapCvds start: cvds [<DVS UUID>], retry-times 60

    INFO Nsx host properites remain in dvs [<DVS UUID>]: ['com.vmware.net.portset.fc.enabled','com.vmware.net.fc.mcast.enabled'] 

    WARNING NSX properties remain in VDS [<DVS UUID>] are not removed yet.
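
The host switch check described above can be scripted. The following is a minimal sketch, not a supported tool: it assumes that net-dvs -l prints each switch's com.vmware.common.alias line before its com.vmware.common.opaqueDvs.status.component.vswitch line, in the format shown in the example output. The function name find_down_switches is hypothetical.

```shell
#!/bin/sh
# Sketch: list host switches whose opaque vswitch status is "down".
# Assumption: for each switch, `net-dvs -l` prints the alias line before
# the status line, in the format shown in this article.

# find_down_switches (hypothetical helper) reads `net-dvs -l` output on
# stdin and prints the alias of every switch whose status is "down".
find_down_switches() {
    awk '
        /com\.vmware\.common\.alias/ {
            # Keep only the value between "=" and the trailing ", propType".
            name = $0
            sub(/^[^=]*=[ \t]*/, "", name)
            sub(/[ \t]*,.*$/, "", name)
            sub(/[ \t\r]+$/, "", name)
        }
        /com\.vmware\.common\.opaqueDvs\.status\.component\.vswitch/ {
            status = $0
            sub(/^[^=]*=[ \t]*/, "", status)
            sub(/[ \t]*,.*$/, "", status)
            sub(/[ \t\r]+$/, "", status)
            if (status == "down") print name
        }
    '
}

# Usage on an ESXi host:
#   net-dvs -l | find_down_switches
```

An empty result means no switch reported the down status. A DVS name that itself contains a comma would be truncated by this sketch.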

Environment

VMware NSX-T Data Center

Cause

This occurs when the uninstall process is unable to remove the NSX module because certain advanced configurations are still applied on the host switch.

Resolution

Workaround: 
  • Run the following command on the affected host:
    • net-dvs -l | grep -E "com.vmware.common.alias|com.vmware.common.opaqueDvs.status.component.vswitch"
  • Identify which DVS still has this parameter set to "down", for example:
    • com.vmware.common.alias = <DVS name> ,        propType = CONFIG
    • com.vmware.common.opaqueDvs.status.component.vswitch = down ,    propType = CONFIG
  • Set this parameter to "up" for that DVS:
    • net-dvs -s com.vmware.common.opaqueDvs.status.component.vswitch=up -p hostPropList <DVS name>
  • Retry the NSX uninstallation through the UI or, if the UI method does not work, with the following command:
    • nsxcli -c del nsx
  • Wait a few minutes, then run the following command and confirm that it returns no results:
    • esxcli software vib list | grep -E 'nsx|vsipfwlib'
  • Reboot the ESXi host and re-onboard it to NSX.
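
The "wait and check" step above can be automated with a small polling loop. This is a sketch under stated assumptions: the function names nsx_vibs_remaining and wait_for_nsx_removal are hypothetical, and the default of 10 attempts 30 seconds apart is an arbitrary choice, not a documented value. It uses only the esxcli command and grep pattern already given in the steps.

```shell
#!/bin/sh
# Sketch: poll until `esxcli software vib list` shows no NSX VIBs.

# nsx_vibs_remaining (hypothetical helper) reads a VIB listing on stdin
# and prints the lines matching the pattern used in this article.
# Returns 0 when nothing matches, 1 when NSX VIBs are still listed.
nsx_vibs_remaining() {
    if grep -E 'nsx|vsipfwlib'; then
        return 1
    fi
    return 0
}

# wait_for_nsx_removal [tries] [delay_seconds]
# Assumed defaults: 10 tries, 30 seconds apart.
wait_for_nsx_removal() {
    tries=${1:-10}
    delay=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # Stop early if esxcli itself fails, so that an empty listing is
        # not mistaken for a clean host.
        if ! list=$(esxcli software vib list); then
            echo "esxcli failed" >&2
            return 2
        fi
        if printf '%s\n' "$list" | nsx_vibs_remaining >/dev/null; then
            echo "No NSX VIBs found"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    echo "NSX VIBs still present" >&2
    return 1
}

# Usage on an ESXi host, after retrying the uninstall:
#   wait_for_nsx_removal && echo "Safe to reboot and re-onboard"
```

A non-zero return means the VIBs were still present after the last attempt, or that esxcli could not be run. In either case, do not proceed to the reboot step.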