Error: "Hosts Version UNKNOWN" after host fails NSX upgrade pre check

Article ID: 322413

Products

VMware NSX, VMware Telco Cloud Platform

Issue/Introduction

  • During an NSX upgrade, the host pre-checks report the warning "Hosts Version UNKNOWN", indicating that one or more hosts have an unknown version. This points to stale or inconsistent entries in the database.
  • The affected hosts (for example, host1.example.com and host2.example.com) were previously in the inventory and prepared for NSX, but have since been removed from vCenter and unprepared for NSX.
  • Searching for the host name in the NSX elastic search still shows the host listed with a transport node UUID.
  • During the pre-upgrade check, the error message for a specific host might look like the following:

Pre-upgrade checks failed for HOST: [UC] Error in rest call. url= /nsxapi/api/v1/transport-nodes/####-####-####-####-########/status , method= GET , response= { "httpStatus" : "NOT_FOUND", "error_code" : 29501, "module_name" : "HeatMap", "error_message" : "Transport node ####-####-####-####-######## not found" } , error= 404 : "{<EOL> "httpStatus" : "NOT_FOUND",<EOL> "error_code" : 29501,<EOL> "module_name" : "HeatMap",<EOL> "error_message" : "Transport node ####-####-####-####-######## not found"<EOL>}

  • Querying the node by its UUID with the following API call shows that NSX is still trying to unprepare it:

GET https://{{MPIP}}/api/v1/transport-nodes/<Transport-Node-UUID>/state

    "node_deployment_state": {
        "state": "failed",
        "details": [
            {
                "sub_system_id": "e3e12eca-####-####-####-########",
                "state": "failed",
                "failure_message": "Failed to uninstall the software on host. Unable to connect the host https://10.#.#.#/sdk.\n",
                "failure_code": 26020
            }
        ]
    },
    "deployment_progress_state": {
        "progress": 46,
        "current_step_title": "Removing NSX bits"
    }

  • The ESXi host shows an Orphaned state in NSX Manager and keeps retrying the NSX uninstallation.
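
The symptom above can be checked programmatically once the /state response has been fetched. The following is a minimal sketch, assuming the response has already been parsed into a Python dict; the function name `uninstall_is_stuck` and the choice to match on the "Failed to uninstall" text are illustrative assumptions, not part of the NSX API.

```python
def uninstall_is_stuck(state: dict) -> bool:
    """Return True if a transport node /state response looks like a failed uninstall.

    Assumption: the response has the shape shown in this article, with a
    "node_deployment_state" object holding "state" and a "details" list.
    """
    deployment = state.get("node_deployment_state", {})
    if deployment.get("state") != "failed":
        return False
    # Look for the uninstall failure text in any sub-system detail entry.
    return any(
        "Failed to uninstall" in detail.get("failure_message", "")
        for detail in deployment.get("details", [])
    )


# Sample data copied from the masked response in this article.
sample = {
    "node_deployment_state": {
        "state": "failed",
        "details": [
            {
                "sub_system_id": "e3e12eca-####-####-####-########",
                "state": "failed",
                "failure_message": "Failed to uninstall the software on host. "
                "Unable to connect the host https://10.#.#.#/sdk.\n",
                "failure_code": 26020,
            }
        ],
    },
    "deployment_progress_state": {
        "progress": 46,
        "current_step_title": "Removing NSX bits",
    },
}

stuck = uninstall_is_stuck(sample)
```

A node that matches this check and belongs to a host already removed from vCenter is a likely candidate for the workaround in the Resolution section.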



Environment

  • Telco Cloud Platform (TCP): 3.x
  • NSX-T: 3.1.x, 3.2.x

Cause

  • When the host was removed from vCenter and then unprepared for NSX, NSX could not cleanly remove the NSX VIBs from it, so the host remains in NSX as an orphaned host.

Resolution

This is a known issue affecting NSX Data Center.

Workaround:

Remove the decommissioned or stale host

  1. Make sure the host is no longer in use.
  2. Use the transport node UUID discovered for the host in the following API call to remove the host from NSX:
    DELETE https://{{MPIP}}/api/v1/transport-nodes/<Transport-Node-UUID>?force=true&unprepare_host=false
  3. Wait approximately 5 minutes to give the cleanup time to complete, then use the GET API to confirm that the node is no longer present in NSX:
    GET https://{{MPIP}}/api/v1/transport-nodes/<Transport-Node-UUID>/state
  4. Once the node is removed, this call returns an "object not found" result.
  5. Re-run the host pre-checks to proceed with the upgrade.
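
The delete-and-verify steps above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not a supported tool: the helper names (`force_delete_url`, `http_status`, `wait_until_gone`) are hypothetical, the "object not found" result is assumed to arrive as HTTP 404, and authentication and TLS certificate handling are left to the caller through the `headers` and `opener` parameters.

```python
import time
import urllib.error
import urllib.request


def transport_node_url(manager: str, node_uuid: str, suffix: str = "") -> str:
    """Build a transport node URL on the NSX Manager (manager = MPIP in this article)."""
    return f"https://{manager}/api/v1/transport-nodes/{node_uuid}{suffix}"


def force_delete_url(manager: str, node_uuid: str) -> str:
    """URL for the forced delete in step 2, without unpreparing the host."""
    return transport_node_url(manager, node_uuid, "?force=true&unprepare_host=false")


def http_status(url, method="GET", headers=None, opener=urllib.request.urlopen):
    """Send one request and return the HTTP status code, including error codes."""
    request = urllib.request.Request(url, method=method, headers=headers or {})
    try:
        with opener(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def wait_until_gone(get_status, attempts=10, delay_s=30.0, sleep=time.sleep):
    """Poll get_status() until it returns 404 (assumed "not found") or attempts run out."""
    for attempt in range(attempts):
        if get_status() == 404:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

A caller would send one DELETE with `http_status(force_delete_url(manager, uuid), method="DELETE", headers=...)` and then pass a function that issues the GET on the `/state` URL to `wait_until_gone`. The defaults of 10 attempts at 30-second intervals are chosen to roughly match the 5-minute wait in step 3.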