ESXi host disconnects after following RESOLVE action for "host install failed" error in NSX

Article ID: 414996

Products

VMware NSX

Issue/Introduction

  • The host transport node is in the "Install Failed" state.
  • The error message shows that the installation is stuck at the "Installing NSX" stage.

  • You may see error logs similar to the following:

2025-10-08T06:35:35.957Z XXXXNSX01.XXXX.co.in NSX 82495 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP26110" level="ERROR" subcomp="manager"] Install of offline bundle on host XXXXNSX01.XXXX.co.in - xxxxxxxx-f7aa-4634-xxxx-xxxxxxxxxxxx threw RemoteException with message: VI SDK invoke exception:java.rmi.RemoteException: VI SDK invoke exception:org.dom4j.DocumentException
2025-10-08T06:35:35.957Z XXXXNSX01.XXXX.co.in NSX 82495 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP26019" level="ERROR" subcomp="manager"]  Install/Upgrade on ESX - VIB install failed on host XXXXNSX01.XXXX.co.in - xxxxxxxx-f7aa-4634-xxxx-xxxxxxxxxxxx with error [Ljava.lang.StackTraceElement;@4e950992

2025-10-08T06:35:36.287Z XXXXNSX01.XXXX.co.in NSX 82495 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Updating the deploymentProgressState for deploymentUnitInstance: DeploymentUnitInstance [ id=DeploymentUnitInstance/xxxxxxxx-aaf5-4886-ab4f-xxxxxxxxxxxx, deploymentUnitId=DeploymentUnit/xxxxxxxx-9929-4dd2-xxxx-4a40xxxxxxxx, hostId=HostTransportNode/xxxxxxxx-f7aa-4634-xxxx-xxxxxxxxxxxx, entityId=null, prevEntityId=null, runningVersion=null, deploymentProgressState=INSTALL_FAILED, deploymentGoalState=ENABLED, internalLastKnownOSVersion=8.0.3, agentId=null, errorId=26050, errorMessage=Failed to install software on host. XXXXNSX01.XXXX.co.in : java.rmi.RemoteException: VI SDK invoke exception:java.rmi.RemoteException: VI SDK invoke exception:org.dom4j.DocumentException] to INSTALL_FAILED:Failed to install software on host. XXXXNSX01.XXXX.co.in : java.rmi.RemoteException: VI SDK invoke exception:java.rmi.RemoteException: VI SDK invoke exception:org.dom4j.DocumentException.
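
To check whether a manager log contains these entries, the lines can be filtered by the error codes and progress state shown above. The following is a minimal sketch, not an official tool: the function name filter_install_failures is hypothetical, and the log file you feed it is an assumption that depends on your environment.

```shell
#!/bin/sh
# Hypothetical helper: reads NSX Manager log text on stdin and prints only
# the lines that carry the error codes or the progress state quoted in this
# article (MP26110, MP26019, INSTALL_FAILED).
filter_install_failures() {
    grep -E 'errorCode="(MP26110|MP26019)"|deploymentProgressState=INSTALL_FAILED'
}

# Example usage (replace <manager-log-file> with the log you are reviewing):
#   filter_install_failures < <manager-log-file>
```

The filter only narrows the log down to candidate lines. Compare the matches against the samples above before concluding that the host is affected.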

  • You attempted to fix the issue using the RESOLVE action on the error message.
  • After the workflow reached 68%, the host disconnected from NSX and vCenter, and the host management IP (vmk0) became unreachable on the network.

  • Running "esxcfg-vswitch -l" on the impacted host returns an error similar to the following:

    # esxcfg-vswitch -l
    DVS Name         Num Ports   Used Ports  Configured Ports  MTU     Uplinks
    Listing failed for DVSwitch: DvsPortset-0, Error: Unable to get the dvs name: Status(bad0007)= Bad parameter
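
The symptom above can also be checked non-interactively, for example from the host console. This is a small sketch that assumes the error text matches the sample exactly. The function name dvs_listing_failed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the text on stdin contains the
# "Listing failed for DVSwitch" line shown in the sample output above.
dvs_listing_failed() {
    grep -q 'Listing failed for DVSwitch'
}

# Example usage on the impacted host:
#   esxcfg-vswitch -l 2>&1 | dvs_listing_failed && echo "symptom present"
```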

Environment

VMware NSX-T Data Center

VMware NSX

Cause

This issue is caused by stale NSX properties (e.g. kcp) left in the ESXi host's distributed switch configuration, which cause a conflict when the host is re-prepared for NSX. Because of this conflict, ESXi cannot install the properties and set the correct switch type, so the switch type falls back to the nulldev device.

Resolution

Verify whether the ESXi host's management interface (vmk0) is on the switch used for NSX.

This issue is resolved in VMware NSX 4.2.2 available at Broadcom Downloads.

If you are having difficulty finding and downloading software, please refer to Download Broadcom products and software.

To recover connectivity of the ESXi host once the issue has occurred, the host's management network needs to be reconnected (usually via a standard switch). For more information, see Recover ESXi host connectivity when management is on DVS.

To prevent this issue when an ESXi host that was previously prepared for NSX is being prepared again, check the distributed switch configuration and remove the stale entries before re-preparing the host for NSX:

  1. SSH to the ESXi host. 
  2. Confirm whether stale properties are present on the switch (which is not currently used by NSX):
    # net-dvs -l | grep "common.alias\|kcp.enable\|vdsSecurity.enabled"
  3. If any of the properties in the output are set to "true" (sample below), proceed to the next step:
                    com.vmware.nsx.kcp.enable = true ,      propType = CONFIG
  4. To disable these properties:
    # net-dvs -u com.vmware.nsx.kcp.enable -p hostPropList <dvsName>
    # net-dvs -u com.vmware.nsx.vdsSecurity.enabled -p hostPropList <dvsName>
    Note: <dvsName> is shown in the output of the command run in step 2 (common.alias).
  5. The host is now ready to be prepared for NSX. 
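
Steps 2 to 4 can be sketched as a dry-run helper that reads the "net-dvs -l" output and prints the matching cleanup commands for review, without running them. This is a minimal sketch under stated assumptions: the function names stale_nsx_props and print_cleanup_commands are hypothetical, and the parsing assumes every property line follows the "name = value , propType = CONFIG" layout of the sample in step 3.

```shell
#!/bin/sh
# Hypothetical helper: reads `net-dvs -l` output on stdin and prints the
# names of the two NSX properties from step 4 that are still set to "true".
# Assumes the "name = value , propType = ..." layout shown in step 3.
stale_nsx_props() {
    awk '($1 == "com.vmware.nsx.kcp.enable" ||
          $1 == "com.vmware.nsx.vdsSecurity.enabled") &&
         $2 == "=" && $3 == "true" { print $1 }'
}

# Hypothetical helper: reads property names on stdin and prints (does not
# run) the step 4 command for each one.
# $1 = <dvsName>, taken from the common.alias line of the step 2 output.
print_cleanup_commands() {
    dvs_name=$1
    while read -r prop; do
        echo "net-dvs -u $prop -p hostPropList $dvs_name"
    done
}

# Example usage on the ESXi host (review the printed commands before
# running any of them):
#   net-dvs -l | stale_nsx_props | print_cleanup_commands <dvsName>
```

The commands are printed rather than run so that each one can be checked against the correct switch name before the switch configuration is changed.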

Additional Information

ESXi host lost network connectivity during preparation for NSX

Recover ESXi host connectivity when management is on DVS