NSX Edge Cluster removal fails in SDDC Manager and remains in "DEACTIVATING" state due to active Tier-0 Gateway dependencies

Article ID: 428690

Products

VMware SDDC Manager

Issue/Introduction

  • Attempts to run the VCF NSX-T Edge Cluster Deployment Removal Tool fail with errors related to the Tier-0 Gateway.
    Delete attempt failed for VCF-edge_########## /policy/api/v1/infra/segments/VCF-edge_##########:
    {'error_code': 500030,
     'error_message': 'The object '
                      'path=[/infra/segments/VCF-edge_##########] '
                      'cannot be deleted as either it has children or it is being '
                      'referenced by other objects '
                      'path=[/infra/tier-0s/##########/locale-services/default/interfaces/####################,/infra/tier-0s/##########/locale-services/default/interfaces/####################,/infra/tier-0s/##########/locale-services/default/interfaces/#######################,/infra/tier-0s/##########/locale-services/default/interfaces/##################################,/infra/tier-0s/##########/locale-services/default/interfaces/#########################,/infra/tier-0s/##########/locale-services/default/interfaces/######################,/infra/tier-0s/##########/locale-services/default/interfaces/#########################,/infra/tier-0s/##########/locale-services/default/interfaces/##########################,/infra/tier-0s/##########/locale-services/default/interfaces/#############################,/infra/tier-0s/##########/locale-services/default/interfaces/############################]',
     'httpStatus': 'BAD_REQUEST',
     'module_name': 'Policy'}
    Cleaner is stopping now.
    Log written to /home/vcf/cleanup/cleaner/edge_cluster_cleaner_#######.log
  • Validation errors highlight specific NSX constructs, such as Prefix Lists or Multi-tenancy Project references, that are actively in use and cannot be automatically deleted. 
  • As a result, the NSX Edge Cluster remains stuck in a "DEACTIVATING" state in the SDDC Manager UI.

Environment

VMware SDDC Manager 5.x

Cause

This issue is typically caused by a synchronization mismatch between the SDDC Manager inventory and the NSX Management plane, often resulting from one of the following factors:

  • Active Production Dependencies: The Tier-0 Gateway associated with the Edge Cluster contains active references (e.g., Prefix Lists, BGP neighbors, or Multi-tenancy Project links) that are still in use. The automated removal workflows are designed to fail safely rather than force-delete active routing configurations.

  • Stale Database Entries: When the initial removal attempt fails mid-workflow (e.g., due to the dependencies mentioned above), the SDDC Manager Postgres database may not roll back correctly. It retains records of the cluster in a "deactivating" state, preventing subsequent removal attempts or host decommissioning.

Resolution

To resolve this issue, blocking dependencies in NSX must be manually cleared, and stale cluster records must be forcibly removed from the SDDC Manager database.

Prerequisites:

  • Root access to the SDDC Manager appliance is required.

  • Admin access to the NSX Manager UI is required.

  • Critical: Take a valid backup or snapshot of the SDDC Manager VM before making any database modifications.

Step 1: Standard Removal Attempt & Dependency Identification

  1. Download and execute the script attached to the Broadcom KB article "VCF NSX-T Edge Cluster Deployment Removal Tool".

  2. Monitor the script output. If the script completes successfully, stop here; no further action is required.

  3. If the script fails on the Tier-0 Gateway, note the specific error messages and identify which objects (e.g., Prefix Lists or Project References) are blocking the deletion.
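
The blocking references from step 3 can also be extracted from the cleaner log instead of being read off the console. The following is a minimal sketch, assuming the log file contains the same "referenced by other objects" message shown in the console output above. The function name list_blocking_paths is hypothetical and is not part of the removal tool.

```shell
# Print each unique Tier-0 policy path found in a cleaner log, one per line.
# Usage: list_blocking_paths <cleaner-log-file>
# Assumption: the log repeats the console error text verbatim.
list_blocking_paths() {
    # A path ends at the first comma, closing bracket, quote, or space.
    grep -oE "/infra/tier-0s/[^],' ]+" "$1" | sort -u
}
```

Run it against the log file named in the "Log written to" line of the script output. Each line printed is one object that still references the segment the tool tried to delete.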

Step 2: Manual NSX Cleanup

  1. Log in to the NSX Manager UI.

  2. Navigate to the specific objects identified in the previous error logs (e.g., Networking > Routing > Tier-0 Gateways or Inventory > Groups/Prefix Lists).

  3. Manually remove only the stale or unused references identified by the script.

[IMPORTANT] Do not delete the Tier-0 Gateway if it is currently handling active production traffic for other workloads.
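
Before removing a reference, it can help to inspect the object it points to. The error output above shows that the tool addresses objects as /policy/api/v1 followed by the policy path, so a read-only GET against the same URL returns the object's current configuration. The following is a minimal sketch under that assumption. The function name nsx_policy_url and the host name nsx.example.local are hypothetical.

```shell
# Build the NSX Policy API URL for a policy path reported by the cleaner.
# Usage: nsx_policy_url <nsx-manager-fqdn> <policy-path>
nsx_policy_url() {
    case "$2" in
        # Only accept paths in the /infra/... form seen in the error output.
        /infra/*) printf 'https://%s/policy/api/v1%s\n' "$1" "$2" ;;
        *) echo "not a policy path: $2" >&2; return 1 ;;
    esac
}
```

For example, curl -s -u admin "$(nsx_policy_url nsx.example.local <policy-path>)" prompts for the admin password and prints the object as JSON. A GET request does not modify the object, which makes it a safe way to confirm that a reference is stale before deleting it in the UI.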

Step 3: SDDC Manager Database Cleanup

If the Edge Cluster remains in a "DEACTIVATING" state after you clear the NSX dependencies, follow these steps to clear the SDDC Manager inventory:

  1. Establish an SSH session to the SDDC Manager appliance using the vcf user, then switch to root.

  2. Take a fresh snapshot of the SDDC Manager VM.

  3. Access the Postgres database by running: psql -h localhost -U postgres

  4. Identify the ID of the stuck Edge Cluster by running the following query: select * from nsxt_edge_cluster;

  5. Locate the entry matching the cluster name stuck in "DEACTIVATING" state and copy its ID.

  6. Delete stale records matching that ID from the following tables:

    • delete from cluster_and_nsxt_edge_cluster where nsxt_edge_cluster_id = '<IMPACTED_CLUSTER_ID>';

    • delete from nsxt_edge_cluster_and_nsxt_cluster where nsxt_edge_cluster_id = '<IMPACTED_CLUSTER_ID>';

    • delete from nsxt_edge_cluster where id = '<IMPACTED_CLUSTER_ID>';

  7. Exit the database prompt by typing: \q
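
The three deletes in step 6 can be run as a single transaction so that a failure partway through does not leave a partial cleanup behind. The following is a minimal sketch that only prints the SQL. The function name build_cleanup_sql is hypothetical, and the check that the ID contains only letters, digits, and hyphens is an assumption about the ID format. Adjust it if the IDs in your environment look different.

```shell
# Print the step 6 deletes, in the same order, wrapped in one transaction.
# Usage: build_cleanup_sql <cluster-id>
build_cleanup_sql() {
    cluster_id=$1
    # Reject empty values and any character that could break out of the
    # quoted SQL literal (assumed ID format: letters, digits, hyphens).
    case "$cluster_id" in
        ''|*[!A-Za-z0-9-]*)
            echo "unexpected cluster ID format: $cluster_id" >&2
            return 1 ;;
    esac
    cat <<EOF
BEGIN;
delete from cluster_and_nsxt_edge_cluster where nsxt_edge_cluster_id = '$cluster_id';
delete from nsxt_edge_cluster_and_nsxt_cluster where nsxt_edge_cluster_id = '$cluster_id';
delete from nsxt_edge_cluster where id = '$cluster_id';
COMMIT;
EOF
}
```

Review the printed statements first, then pipe them into the psql command from step 3 with -v ON_ERROR_STOP=1 added. With that setting psql stops at the first error, and the open transaction is rolled back when the session ends.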

Step 4: Service Restart and Verification

  1. Restart SDDC Manager services to refresh the UI inventory by executing: /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh

  2. Wait 5–10 minutes for all services to initialize fully.

  3. Log in to the SDDC Manager UI.

  4. Verify that the stuck NSX Edge Cluster no longer appears in the inventory.

  5. Proceed with the standard Host Decommissioning workflow via SDDC Manager as required.