PostgreSQL PCP recovery fails in a VMware Identity Manager environment during patching or cluster remediation.
Standby nodes appear as DOWN in the pgpool status output.
Running the PCP recovery command fails with an error similar to the following:
/usr/local/bin/pcp_recovery_node -h <delegateIP> -p 9898 -U pgpool -n <node_id>
Execution of command failed at 1st stage - recovery_1st_stage
The server.log file reports missing function errors.
PostgreSQL, pgService, and Horizon Workspace services remain unstable.
The Load Balancer returns 502 errors, and the OpenSearch service is unavailable.
Only the primary node remains operational, preventing patching or remediation from proceeding.
VMware Identity Manager 3.3.7
The PCP recovery process fails due to one or both of the following reasons:
Incorrect file ownership of the recovery stage file (/db/data/recovery_1st_stage) — the file is owned by root instead of postgres.
Missing PostgreSQL extension pgpool_recovery, which prevents pgpool from executing the required recovery function.
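To confirm whether the second cause applies, the extension catalog can be queried before changing anything. The following is a minimal sketch, not a vendor-supplied script: the helper name extension_present is hypothetical, and the connection settings (localhost, user postgres, database template1) are assumed to match the psql command used in this article.

```shell
#!/bin/sh
# Sketch: report whether an extension is registered in template1.
# PSQL can be overridden; the default path is the one used in this article.
PSQL="${PSQL:-/opt/vmware/vpostgres/current/bin/psql}"

# Hypothetical helper: returns 0 when pg_extension lists the named extension,
# 1 when it does not, and 2 when the query itself fails.
extension_present() {
    ext=$1
    # -t and -A suppress headers and alignment, so the output is "1" or empty.
    result=$("$PSQL" -h localhost -U postgres -d template1 -tAc \
        "SELECT 1 FROM pg_extension WHERE extname = '$ext';") || return 2
    [ "$result" = "1" ]
}

# Example call on the primary node:
# extension_present pgpool_recovery && echo "present" || echo "missing or query failed"
```

The check is read-only, so it is safe to run before deciding whether the extension needs to be created.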
Log in to the affected node and check the ownership of the recovery stage file (/db/data/recovery_1st_stage).
If the file is owned by root, change the ownership to postgres:users.
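The ownership change can be sketched as the shell snippet below. The file path and the postgres:users owner come from this article. The function names are hypothetical, and the stat -c syntax assumes GNU coreutils is available on the appliance.

```shell
#!/bin/sh
# Path and target owner are taken from the article; override them if needed.
RECOVERY_FILE="${RECOVERY_FILE:-/db/data/recovery_1st_stage}"
WANT_OWNER="${WANT_OWNER:-postgres:users}"

# Print the current owner of a file as user:group (GNU stat syntax).
current_owner() {
    stat -c '%U:%G' "$1"
}

# Change ownership only when it differs from the expected value.
ensure_owner() {
    file=$1
    want=$2
    have=$(current_owner "$file") || return 1
    if [ "$have" = "$want" ]; then
        echo "OK: $file is already owned by $want"
    else
        echo "Fixing: $file is owned by $have, changing to $want"
        chown "$want" "$file"
    fi
}

# Example call (run as root on the affected node):
# ensure_owner "$RECOVERY_FILE" "$WANT_OWNER"
```

Checking first keeps the step idempotent, so it can be repeated on every node without side effects.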
Create the pgpool_recovery extension manually if it does not exist on the Primary node:
/opt/vmware/vpostgres/current/bin/psql -h localhost -U postgres -d template1 \
  -c "CREATE EXTENSION IF NOT EXISTS pgpool_recovery WITH SCHEMA pg_catalog;"
Restart the PostgreSQL and pgpool services to apply changes:
/etc/init.d/vpostgres restart
/etc/init.d/pgService restart
Check the pgpool node status:
/usr/local/bin/pcp_node_info -h <delegateIP> -p 9898 -U pgpool -n all
All nodes should now show as UP.
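When several nodes are listed, a small filter can flag any that are still down. This sketch assumes the non-verbose pcp_node_info layout described in the pgpool-II documentation, where the third whitespace-separated field is a numeric status and 3 means down. Verify that layout against the pgpool version installed on the appliance. The function name down_nodes is hypothetical, and PCP authentication (password prompt or .pcppass) is not handled here.

```shell
#!/bin/sh
# Read pcp_node_info output on stdin and print the lines whose status field
# (assumed to be column 3) equals 3, which pgpool-II documents as "down".
# Returns 0 when no node is down and 1 otherwise.
down_nodes() {
    awk '$3 == 3 { print; bad = 1 } END { exit bad ? 1 : 0 }'
}

# Example call:
# /usr/local/bin/pcp_node_info -h <delegateIP> -p 9898 -U pgpool -n all | down_nodes \
#     && echo "no node reported down"
```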
Re-run the PCP recovery process on the primary node:
/usr/local/bin/pcp_recovery_node -h <delegateIP> -p 9898 -U pgpool -n <node_id>
Note: The command should now complete successfully.
Ensure all services are stable and the environment is synchronized:
/etc/init.d/vpostgres status
/etc/init.d/pgService status
/etc/init.d/opensearch status
Run an inventory sync in LCM to validate connectivity.
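The three status checks can also be run in one pass. The sketch below assumes each init script follows the usual convention of returning a non-zero exit code from status when the service is not running. That behavior has not been confirmed for these particular scripts, and the function name check_services is hypothetical.

```shell
#!/bin/sh
# Directory holding the init scripts; the default matches the paths in this article.
INIT_DIR="${INIT_DIR:-/etc/init.d}"

# Run "<script> status" for each named service and report the result.
# Returns 1 if any status call exits non-zero, 0 otherwise.
check_services() {
    failed=0
    for svc in "$@"; do
        if "$INIT_DIR/$svc" status >/dev/null 2>&1; then
            echo "OK $svc"
        else
            echo "FAILED $svc"
            failed=1
        fi
    done
    return "$failed"
}

# Example call on each node:
# check_services vpostgres pgService opensearch
```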