Stale host entries in SDDC postgres database
search cancel

Stale host entries in SDDC postgres database

book

Article ID: 436010

calendar_today

Updated On:

Products

VMware SDDC Manager / VCF Installer

Issue/Introduction

Workload domain imports fail.

The customer deleted the workload domain from the VCF Ops then deleted the vCenter/reinstalled the ESXi hosts to desired config.

The vCenter and ESXi hosts have the same name as before. 

Import now using the same named vCenter/ESXi hosts, they are seeing errors during the import:

  • Error in /var/log/vmware/vcf/commonsvcs/vcf-commonsvcs.log
    ERROR [common,######-###-###-###-############ ,39##] [c.v.e.s.i.s.EsxiInventoryServiceImpl,http-nio-##########-###-92] ESXi <hostaname>already exist: ######-###-###-###-############  <hostname> already exists
    ERROR [common,######-###-###-###-############ ,39##] [c.v.e.s.i.s.e.DomainExtensionsImpl,http-nio-##########-###-92] Failed to post domain inventory com.vmware.evo.sddc.inventory.model.InventoryAlreadyExistsException: ######-###-###-###-############  <hostname> already exists

Environment

VCF 9.0.1

Cause

The host had previously imported, then removed and then they tried to add then back in.

They had taken the correct steps to delete the WLD cluster yet there were stale entries in the postgres database.

The hosts that were removed were still showing in the postgres database.

Resolution

  1. Take a snapshot of the SDDC Manager VM using vCenter
  2. SSH into the SDDC Manager using the vcf user and then su root
  3. Connect to the database:
    • psql --host=localhost -U postgres -d platform
  4. Identify the ID for the host to be removed:
      • select id, hostname, status from host where hostname='<FQDN>';
      • If the FQDN is not known, then 
        • select id, hostname, status from host where status='ERROR';
        • select id, hostname, status from host where status='DEACTIVATING';
      • If there are multiple ESXi hosts and they are in an error or deactivating status, this can be a better process.
  5.  If you do not get an output of 'ERROR' or 'DEACTIVATING' then go onto step 6.   If you get an output for 'ERROR' or 'DEACTIVATING'  then go to the following KB to work through that process:  Unable to remove ESXi host from VCF SDDC manager due to database mismatch | Steps to remove ESXi Hosts from the SDDC Manager DB
  6.  To see the hosts that the postgres shows, run the following:
    • select * from host;
  7.  To see the hosts id and hostname run the following:
    • select id, hostname, management_ip_address from host;
  8.  To remove the hosts that are possibly in a stale state then run the following:
    • delete from host where hostname='hostname of host that is presumed stale';
  9.  Restart all services, wait a few minutes and then retry the task.