VCLS VMs fail to deploy in cluster with the error "duplicate key value violates unique constraint "hdcs_cluster_agencies_pkey""

Article ID: 321982


Products

VMware vCenter Server

Issue/Introduction

  • After an entry is deleted from the vpx_ext_data table in the vCenter Server database, all associated vSphere Cluster Services (vCLS) virtual machines are automatically removed, as expected.
  • However, the vCLS VMs are not re-created on the affected clusters afterwards.
  • Confirming that vCLS Retreat Mode is disabled for the cluster does not resolve the issue.
  • Restarting the vCenter services does not trigger re-creation of the vCLS VMs either.
  • The wcpsvc.log file shows the following error, which indicates what is blocking vCLS VM deployment:

    /var/log/vmware/wcp/wcpsvc.log

    YYYY-MM-DDTHH:MM:SS.###Z error wcp [eamagency/create.go:72] [opID=vCLS] Unable to create entity in db for cluster agency: ERROR: duplicate key value violates unique constraint "hdcs_cluster_agencies_pkey" (SQLSTATE 23505)

  • The following error may also appear in vpxd.log:

    /var/log/vmware/vpxd/vpxd.log

    YYYY-MM-DDTHH:MM:SS.###Z-04:00 error vpxd[11060] [Originator@6876 sub=Default opID=wcp-vCLS-9b] [VdbStatement] SQLError was thrown: "ODBC error: (23505) - ERROR: duplicate key value violates unique constraint "pk_vpx_auth_tenant_mgmt_rs"
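
To check quickly whether a vCenter Server shows these symptoms, both log files can be searched for the constraint names quoted above. The following is a minimal shell sketch, not a VMware-provided tool: the function name find_vcls_constraint_errors is hypothetical, and the sketch assumes the log locations given in this article.

```shell
# Hypothetical helper (not a VMware tool): print every line of a log file
# that mentions one of the two constraint names quoted in this article.
# Exit status follows grep: 0 if a match was found, 1 if none, 2 on error.
find_vcls_constraint_errors() {
    # -F treats the patterns as fixed strings; each -e adds one pattern.
    grep -F -e 'hdcs_cluster_agencies_pkey' \
            -e 'pk_vpx_auth_tenant_mgmt_rs' -- "$1"
}

# Example invocation on the appliance (paths as given in this article):
#   find_vcls_constraint_errors /var/log/vmware/wcp/wcpsvc.log
#   find_vcls_constraint_errors /var/log/vmware/vpxd/vpxd.log
```

No output and an exit status of 1 mean that neither constraint name appears in that file. Rotated or compressed logs are not covered by this sketch.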

Environment

VMware vCenter Server 7.x

Cause

When an entry related to vCLS is removed from the vpx_ext_data table in the vCenter database, it correctly triggers the automatic removal of the associated vCLS agency VMs from vCenter. However, the corresponding agency entry in the hdcs.hdcs_cluster_agencies table might not be cleaned up at the same time. This inconsistency leaves a stale record. When vCenter later tries to create a new agency entry for the same cluster, the insert fails with the duplicate key error on hdcs_cluster_agencies_pkey, so no new vCLS entries or VMs are provisioned for that cluster.

Resolution

Note: Before performing the following steps, it is strongly recommended to take a snapshot of your vCenter Server. Refer to KB: VMware vCenter in Enhanced Linked Mode pre-changes snapshot (online or offline) best practice

To resolve this issue, the stale agency entry must be manually located and removed from the hdcs.hdcs_cluster_agencies table within the vCenter Server's embedded vPostgres database.

  1. Before proceeding, ensure that vCLS Retreat Mode is disabled for the affected cluster. Verify that the configuration config.vcls.clusters.domain-c<number>.enabled is set to True for your specific cluster. This is crucial for vCLS VMs to be deployed successfully.
    Refer to KB: Disable vCLS on a Cluster via Retreat Mode

  2. Establish an SSH connection to your vCenter Server Appliance.

  3. Access the vCenter Server's embedded vPostgres database. For comprehensive guidance on how to do this, please refer to KB: Interacting with the vCenter Server Appliance 6.5/6.7/7.0/8.0 embedded vPostgres Database

  4. Identify the cluster_moref for your affected cluster. This is typically in the format domain-c<number>. You can find this value in the vCenter UI (e.g., in the URL when viewing the cluster).

  5. Locate the specific cluster agency entry that needs to be removed from the database.

  6. Run the following SQL query, replacing <domain-number-of-cluster> with the cluster_moref of the cluster whose vCLS VMs are not being created (e.g., domain-c30019):

    VCDB=# SELECT cluster_moref, agency_moref FROM hdcs.hdcs_cluster_agencies WHERE cluster_moref='<domain-number-of-cluster>';

    For example:

    VCDB=# select cluster_moref, agency_moref FROM hdcs.hdcs_cluster_agencies WHERE cluster_moref='domain-c30019'; 

     cluster_moref |             agency_moref
    ---------------+--------------------------------------
     domain-c30019 | 7f4ec319-053d-4dff-8dda-a69ee368a86e
    (1 row)

  7. Delete the cluster agency entry for the cluster:

    VCDB=# delete from hdcs.hdcs_cluster_agencies where cluster_moref='<domain-number-of-cluster>';

    Use the same <domain-number-of-cluster> value that was identified in step 4.

  8. Verify the deletion by re-running the SELECT query from step 6. The query should now return no rows for the cluster:

     cluster_moref | agency_moref
    ---------------+--------------
    (0 rows)

Once the cluster agency entry is deleted from hdcs.hdcs_cluster_agencies, the vCLS VMs should be immediately re-created.
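
Steps 4 through 8 can be prepared ahead of time to reduce the risk of a mistyped identifier. The following shell sketch is an illustration under stated assumptions, not part of the documented procedure: both function names are hypothetical, the helper simply looks for the first domain-c<number> token in whatever string it is given, and wrapping the statements in a BEGIN ... COMMIT/ROLLBACK transaction is an added precaution that the steps above do not include. The sketch only prints SQL; it does not connect to or modify the database.

```shell
# Hypothetical helper: pull the first domain-c<number> token out of a string,
# for example a vSphere Client URL copied while viewing the cluster.
extract_cluster_moref() {
    printf '%s\n' "$1" | grep -o 'domain-c[0-9][0-9]*' | head -n 1
}

# Hypothetical helper: print the SELECT / DELETE / SELECT sequence from this
# article for one cluster, wrapped in a transaction (an added assumption).
# Anything that is not exactly domain-c<digits> is rejected.
build_agency_cleanup_sql() {
    suffix=${1#domain-c}
    case "$suffix" in
        "$1"|''|*[!0-9]*)
            echo "error: expected a value such as domain-c30019" >&2
            return 1
            ;;
    esac
    cat <<EOF
BEGIN;
SELECT cluster_moref, agency_moref FROM hdcs.hdcs_cluster_agencies WHERE cluster_moref='$1';
DELETE FROM hdcs.hdcs_cluster_agencies WHERE cluster_moref='$1';
SELECT cluster_moref, agency_moref FROM hdcs.hdcs_cluster_agencies WHERE cluster_moref='$1';
-- Review the output, then enter COMMIT; to keep the change or ROLLBACK; to undo it.
EOF
}
```

For example, build_agency_cleanup_sql domain-c30019 prints the statements for that cluster so they can be reviewed and then pasted at the VCDB=# prompt. The transaction stays open until COMMIT; or ROLLBACK; is entered.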