Node in VMware Identity Manager will not start Postgres.

Article ID: 401750

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

Checking the status of Postgres or trying to restart it returns the following error about the /db/data directory:

/etc/init.d/vpostgres status
/etc/init.d/vpostgres restart
pg_ctl: directory "/db/data" is not a database cluster directory

The /db/data directory is empty or contains only the recovery.conf or serverlog files.
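
The condition above can be checked with a short script. This is a minimal sketch, not part of the product: the function name data_dir_state is hypothetical, and it assumes the leftover files are named exactly recovery.conf and serverlog.

```shell
#!/bin/sh
# Hypothetical helper (does not ship with VMware Identity Manager).
# Classifies a Postgres data directory and prints one of:
#   missing | empty | residual-only | has-other-files
data_dir_state() {
    dir="${1:-/db/data}"
    [ -d "$dir" ] || { echo "missing"; return 0; }

    # List every entry, including dotfiles, but not . and ..
    entries=$(ls -A "$dir")
    if [ -z "$entries" ]; then
        echo "empty"
        return 0
    fi

    # Assumption: only files named exactly recovery.conf or serverlog
    # count as leftovers; anything else counts as other content.
    others=$(printf '%s\n' "$entries" | grep -v -x -F -e recovery.conf -e serverlog || true)
    if [ -z "$others" ]; then
        echo "residual-only"
    else
        echo "has-other-files"
    fi
}
```

Running data_dir_state /db/data on a replica node that prints empty or residual-only matches the symptom described in this article.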

Environment

VMware Identity Manager 3.3.7

Cause

The Postgres service fails to start on the replica nodes because the required database files are missing: the /db/data directory is empty on the two standby nodes. This breaks the database cluster and brings down the VIDM UI.

Resolution

  1. Stop the pgPool service, starting with the replica nodes and finishing with the primary:

    /etc/init.d/pgService stop
  2. On the replica nodes, remove any files under /db/data. Do not remove any files on the primary node.

    rm -rf /db/data/*
  3. This action triggers a pg_basebackup that copies data from the primary to the replica nodes. Wait about 5 minutes, then check whether /db/data is populated.

  4. If there is no data, skip to step 6 and run pg_basebackup manually to copy the data to the replica node. If data is populated, start vpostgres on the replica nodes:

    /etc/init.d/vpostgres start
  5. Start pgPool, beginning with the primary and then the replica nodes:

    /etc/init.d/pgService start
  6. Run a pg_basebackup to copy files to /db/data on the replica nodes:

    /opt/vmware/vpostgres/current/bin/pg_basebackup -h master -D /db/data -U repl -P -Xs

    This prompts for a password. You can find the password on the primary node in the /db/data/recovery.done file in the password= field.

  7. Check the /db/data directory on the replica nodes after the pg_basebackup has completed.

  8. Validate the build version:

    grep "SAAS VERSION" /opt/vmware/horizon/workspace/logs/*.log | tail -1

    If the version is 3.3.7.0 Build 25163938 or Build 24966008, skip to step 10.

  9. Make sure the file recovery.conf exists; if not, copy the file recovery.done to recovery.conf:

    cp /db/data/recovery.done /db/data/recovery.conf
  10. Make sure all files in the /db/data directory are owned by postgres:users. If not, run:

    chown -R postgres:users /db/data
  11. Return to steps 4 and 5 to start vpostgres and pgPool.
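
The replica-side checks in steps 3, 9, and 10 can be sketched as shell functions. This is an illustrative sketch under stated assumptions, not a supported tool: the function names are hypothetical, "populated" is approximated as "the directory is not empty", and the step 9 helper should be run only on builds where step 9 applies.

```shell
#!/bin/sh
# Hypothetical helpers for the replica-side checks; none of these names
# ship with VMware Identity Manager.

# Step 3: poll until DIR has any entry, or give up after TIMEOUT seconds.
# Assumption: "populated" is approximated as "not empty".
wait_for_population() {
    dir="${1:-/db/data}"
    timeout="${2:-300}"     # about 5 minutes
    interval="${3:-15}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    # Final check after the last sleep.
    [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Step 9: create recovery.conf from recovery.done only if it is missing.
ensure_recovery_conf() {
    dir="${1:-/db/data}"
    if [ -f "$dir/recovery.conf" ]; then
        return 0
    fi
    if [ -f "$dir/recovery.done" ]; then
        cp "$dir/recovery.done" "$dir/recovery.conf"
        return $?
    fi
    echo "neither recovery.conf nor recovery.done found in $dir" >&2
    return 1
}

# Step 10: print every path under DIR not owned by OWNER:GROUP.
# Empty output means no chown is needed.
list_wrong_owner() {
    dir="${1:-/db/data}"
    owner="${2:-postgres}"
    group="${3:-users}"
    find "$dir" \( ! -user "$owner" -o ! -group "$group" \) -print
}
```

For example, if wait_for_population /db/data 300 15 returns non-zero, continue with the manual pg_basebackup in step 6. If list_wrong_owner /db/data prints any paths, run the chown command from step 10.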