Host removal operation fails at the "Validate Impact for Edge Cluster" subtask due to Edge Cluster node requirements on the cluster

Article ID: 406165

Products

VMware SDDC Manager

Issue/Introduction

  • A host removal task initiated from the SDDC Manager UI fails with the following error:
    Sub-Task: Validate Impact for Edge Cluster on ESXi Host Deletion
    Progress Message: Cannot proceed, removing host(s) <Host_FQDN> hosting <Number_of_hosts_in_Cluster> Edge node(s) for Edge cluster <Cluster_Name>. Remaining <Number_Remaining_hosts>  host(s) is not sufficient. Since Edge cluster <Cluster_Name> requires a minimum of <Number_of_required_Hosts> host(s)

Environment

VCF 5.x

Cause

  • As part of the host removal process, several checks are run to ensure that removing the host will have no impact on production.
  • One of these checks, "Validate Impact for Edge Cluster", ensures that the Edge node architecture that remains after the host is removed is sufficient to maintain full network connectivity and handle the network load.
  • SDDC Manager queries NSX-T and vCenter directly to find all Edge clusters with nodes hosted on the vCenter cluster from which the host is to be removed. If any Edge cluster has more Edge nodes on that cluster than the remaining hosts can accommodate (normally one Edge node per host), the operation is blocked.
  • This is by design.
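
The check described above can be illustrated with a minimal sketch. This is a simplified model of the rule stated in this article (one Edge node per host), not SDDC Manager's actual implementation; the function name, its parameters, and the example cluster names are hypothetical.

```python
from typing import Dict, List


def blocked_edge_clusters(
    edge_nodes_on_cluster: Dict[str, int],
    total_hosts: int,
    hosts_to_remove: int,
) -> List[str]:
    """Return the Edge clusters that would block the host removal.

    Simplified model: one Edge node per host, so an Edge cluster blocks
    the removal when it has more Edge nodes on this vCenter cluster than
    there are hosts left after the removal.
    """
    remaining_hosts = total_hosts - hosts_to_remove
    return [
        name
        for name, node_count in edge_nodes_on_cluster.items()
        if node_count > remaining_hosts
    ]


# Hypothetical example: a 3-host cluster from which 1 host is removed.
edge_nodes = {"edge-cluster-a": 2, "edge-cluster-b": 3}
print(blocked_edge_clusters(edge_nodes, total_hosts=3, hosts_to_remove=1))
# prints ['edge-cluster-b']
```

In this example two hosts remain, so the Edge cluster with two nodes still fits, while the one with three nodes does not and would block the operation.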

Resolution

To resolve this issue, perform the host removal using the Public API rather than the UI, and ensure the "forceByPassingSafeMinSize" parameter is set to "true". Perform the following steps:

  • Get the cluster ID of the cluster from which the host needs to be removed. Use the following API and make a note of the cluster ID.
    GET /v1/clusters

  • Get the host ID of the host that needs to be removed. Use the following API together with the cluster ID found in the previous step, and make a note of the host ID.
    GET /v1/hosts

  • Use the following API to remove the host. Ensure the correct cluster ID is entered in the required {id} field.
    PATCH /v1/clusters/{id}

    Use the following "CompactionSpec" information in the request body, and enter the host ID as the "id" value.

        {
          "clusterCompactionSpec": {
            "forceByPassingSafeMinSize": true,
            "hosts": [
              {
                "id": "##########################"
              }
            ]
          }
        }

 


  • Once the above API call completes successfully, the ESXi host is removed from the active configuration and moved to the unassigned hosts.
  • The ESXi host can then be decommissioned from the unassigned hosts.
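
The API steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official client: the helper names are hypothetical, the "name" and "fqdn" lookup keys and the bearer-token Authorization header are assumptions about the SDDC Manager API that should be checked against the API reference for your VCF version, and the host name and IDs in the example are placeholders.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def find_id(elements: List[Dict[str, Any]], key: str, value: str) -> Optional[str]:
    """Return the "id" of the first element whose `key` equals `value`.

    `elements` is assumed to be the list of objects returned by
    GET /v1/clusters or GET /v1/hosts (field names are assumptions).
    """
    for element in elements:
        if element.get(key) == value:
            return element.get("id")
    return None


def build_compaction_body(host_ids: List[str]) -> Dict[str, Any]:
    """Build the PATCH body shown in this article for the given host IDs."""
    return {
        "clusterCompactionSpec": {
            "forceByPassingSafeMinSize": True,
            "hosts": [{"id": host_id} for host_id in host_ids],
        }
    }


def build_patch_request(
    base_url: str, token: str, cluster_id: str, host_ids: List[str]
) -> urllib.request.Request:
    """Prepare (but do not send) PATCH /v1/clusters/{id}."""
    body = json.dumps(build_compaction_body(host_ids)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/clusters/{cluster_id}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a bearer access token.
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical data standing in for the GET /v1/hosts result.
hosts = [{"id": "host-id-1", "fqdn": "esx01.example.com"}]
host_id = find_id(hosts, "fqdn", "esx01.example.com")
print(json.dumps(build_compaction_body([host_id]), indent=2))
```

The prepared request can be sent with urllib.request.urlopen(request) once the base URL, token, and IDs are replaced with real values. Building the body separately from sending it makes it easy to review the exact JSON, including "forceByPassingSafeMinSize", before the safety check is bypassed.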

 

Prerequisites:

  • This procedure only works on a host that is dead, powered down, or not responding.
  • It is important to assess the impact on the Edge architecture before proceeding. If in any doubt, open a Support Request with our NSX Support Team. To open a support case, see "Creating and managing Broadcom support cases".