How to remove a stale host entry

Products

VMware NSX

Issue/Introduction

NSX Transport Nodes cannot be decommissioned from the environment.
The orphaned hosts may be listed under the Standalone tab in the NSX Manager UI (System > Fabric > Hosts).
Attempting to "Delete NSX" via the Manager UI with the Force option enabled results in the decommission process hanging indefinitely at 20%.
Executing a manual DELETE request via the REST API with the parameters force=true and unprepare_host=false returns a 200 OK response, yet the backend task remains stuck at 20%.
After the process hangs at 20%, an API GET request for the Transport Node status returns a 404 Not Found error, indicating a discrepancy between the task state and the database object:
```
####.####.####.#### - "GET /nsxapi/api/v1/transport-nodes/<transport-node-UUID>/status HTTP/1.1" 404
```

The nsxapi.log confirms that the DELETE operation is initiated (progress moves from 8% to 20%) but does not advance beyond the 20% mark:

nsxapi.5.log:900256:####.####.####249Z INFO L2HostConfigTaskExecutor5 DeploymentProgressServiceImpl FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Updating the DeploymentProgress from: DeploymentProgress [ deploymentType=HOST_TN, operationType=DELETE, progress=8, stateDescription=deployment.progress.tn.delete.waiting_for_host_config_reply, removeNsxFlag=false] to DeploymentProgress [ id=e36##########, deploymentType=HOST_TN, operationType=DELETE, progress=20, stateDescription=deployment.progress.tn.delete.waiting_for_host_config_reply, removeNsxFlag=false]
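To check several log files for this pattern, the progress values can be extracted from the log lines. The following is a minimal sketch that assumes the log lines follow the `DeploymentProgress [ key=value, ... ]` layout shown in the excerpt above. The function names `parse_progress` and `last_delete_progress` are hypothetical helpers and not part of any NSX tooling.

```python
import re

# Matches each "DeploymentProgress [ key=value, ... ]" group in a log line.
# Assumption: the layout matches the nsxapi.log excerpt shown above.
PROGRESS_RE = re.compile(r"DeploymentProgress \[([^\]]*)\]")


def parse_progress(line):
    """Return one dict of fields per DeploymentProgress group found in line."""
    groups = []
    for body in PROGRESS_RE.findall(line):
        fields = {}
        for part in body.split(","):
            key, sep, value = part.strip().partition("=")
            if sep:
                fields[key] = value
        groups.append(fields)
    return groups


def last_delete_progress(lines):
    """Return the most recent progress logged for a HOST_TN DELETE, or None."""
    latest = None
    for line in lines:
        groups = parse_progress(line)
        if not groups:
            continue
        new_state = groups[-1]  # the "to" state is the last group on the line
        if (new_state.get("deploymentType") == "HOST_TN"
                and new_state.get("operationType") == "DELETE"
                and new_state.get("progress", "").isdigit()):
            latest = int(new_state["progress"])
    return latest
```

If `last_delete_progress` keeps returning 20 across successive log files, that is consistent with the hang described in this article.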

Environment

VMware NSX

Cause

The ESXi hosts were manually removed from the vCenter inventory before they were decommissioned in NSX. For details, refer to the KB article "Best Practice for Decommissioning an NSX-Prepared ESXi Host to Avoid Stale Entries".

Resolution

Stale host entries in NSX Manager fall into one of two scenarios:

1. Stale Presence on the UI

This occurs when a host is no longer part of the vCenter inventory or is scheduled for decommissioning, and NSX has already been uninstalled. To detect such stale host entries, follow these steps:

Log in to the NSX Manager UI and navigate to the host inventory (System > Fabric > Hosts).
Look for any hosts that do not exist in the vCenter inventory or have NSX uninstalled.
If any such hosts are found, they can be considered stale entries and should be removed from the NSX inventory.

To remove these entries, select the host in the UI and click "Remove NSX", or use the API:

DELETE "https://localhost/api/v1/transport-nodes/<transport-node-UUID>?force=true"
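The API call above can also be assembled in a script. The following is a minimal sketch using only the Python standard library. It builds the request object but does not send it. The helper name `build_delete_request` is hypothetical, and authentication and certificate handling are deliberately left out because they depend on the environment.

```python
import urllib.parse
import urllib.request


def build_delete_request(manager, node_id, force=True, unprepare_host=None):
    """Build (but do not send) a DELETE request for a transport node.

    manager: NSX Manager host name or IP, e.g. "localhost".
    node_id: the transport node UUID.
    unprepare_host: added to the query string only when explicitly set.
    """
    query = {"force": str(force).lower()}
    if unprepare_host is not None:
        query["unprepare_host"] = str(unprepare_host).lower()
    url = "https://{}/api/v1/transport-nodes/{}?{}".format(
        manager,
        urllib.parse.quote(node_id, safe=""),
        urllib.parse.urlencode(query),
    )
    return urllib.request.Request(url, method="DELETE")
```

Sending the request, for example with `urllib.request.urlopen`, requires valid NSX credentials and a trusted certificate chain, neither of which is shown here.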

2. Stale Host Entry Present Within the Database but Not in the NSX Management UI

To identify and validate stale host entries in the database, follow these steps:

From the NSX Manager root CLI, run the following command to list the host transport node entries stored in the database:

/opt/vmware/bin/corfu_tool_runner.py --tool corfu-editor -n nsx -o showTable -t HostTransportNode | grep "host-transport-nodes"

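Return to the NSX Manager UI, navigate to the host inventory (System > Fabric > Hosts), and compare the number of hosts with the number of entries returned by the command. Entries that exist in the database but are not shown in the UI are stale.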
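Comparing the two lists by hand is error-prone when many hosts are involved. The following is a minimal sketch of the comparison. It assumes that each matching line of the command output contains a path segment of the form `host-transport-nodes/<UUID>` and that the IDs shown in the UI have been collected separately. Both the output format and the helper names are assumptions, so verify them against the actual output before relying on the result.

```python
import re

# Assumption: IDs appear as ".../host-transport-nodes/<36-character UUID>".
UUID_RE = re.compile(r"host-transport-nodes/([0-9a-fA-F-]{36})")


def ids_from_corfu_output(text):
    """Collect the transport node IDs found in the saved command output."""
    return {match.lower() for match in UUID_RE.findall(text)}


def stale_ids(corfu_text, ui_ids):
    """Return IDs present in the database output but missing from the UI list."""
    return ids_from_corfu_output(corfu_text) - {i.lower() for i in ui_ids}
```

Any ID returned by `stale_ids` is only a candidate for cleanup. Confirm it against the vCenter inventory before removing anything.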

If a stale Transport Node entry exists, follow the Resolution section of the KB article "Installing or Upgrading NSX on an ESXi host fails reporting the node already exists" to clear the stale entries.

If the issue persists and you need assistance deleting any of the stale database entries, open a Broadcom Technical Support case. For more information, see "How to Submit a Support Request".

Additional Information

If the issue persists after the steps above, a rolling reboot of the NSX Manager nodes may be required. After the reboots complete, the stale nodes can be deleted by following the steps above.

NOTE: After a reboot, the orphaned nodes may appear in the "Standalone Node" section.