Error Code: LCMVIDM74055
/usr/local/bin/pcp_recovery_node -h <delegateIP> -p 9898 -U pgpool -n <node_id>
ERROR: executing recovery, execution of command failed at "1st_stage"
DETAIL: command: "recovery_1st_stage"

VMware Identity Manager 3.3.7
This issue occurs due to a split-brain or misconfigured state within the pgpool and PostgreSQL cluster. Specifically:
Running the poolnodes command shows that multiple nodes (or all nodes) are reporting a pg_role of primary.
The /etc/hosts files across the nodes contain incorrect or mismatched entries for the master node.
Standard recovery fails because pcp_recovery_node cannot complete when the pg_pool_recovery extension is missing or the nodes cannot reach the designated master.
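The first symptom above (more than one node reporting primary) can be checked mechanically. The following is a minimal sketch, assuming the poolnodes output is the usual pipe-delimited table that pgpool's "show pool_nodes" prints through psql. The function name count_primaries is hypothetical and not part of the product. It locates the pg_role column from the header row and falls back to role if pg_role is absent.

```shell
# Hypothetical helper: count the rows whose pg_role (or role) is "primary".
# Reads a pipe-delimited pool_nodes table on standard input.
count_primaries() {
  awk -F'|' '
    # Header row: find the pg_role column, falling back to role.
    col == 0 && /node_id/ {
      for (i = 1; i <= NF; i++) {
        name = $i
        gsub(/[ \t]/, "", name)
        if (name == "pg_role") { col = i }
        else if (name == "role" && col == 0) { col = i }
      }
      next
    }
    # Data rows: separator and footer lines have too few fields and are skipped.
    col > 0 && NF >= col {
      v = $col
      gsub(/[ \t]/, "", v)
      if (v == "primary") n++
    }
    END { print n + 0 }
  '
}
```

With the poolnodes output saved to a file, `count_primaries < saved_output.txt` prints the number of nodes reporting primary; any value other than 1 matches the split-brain symptom described above.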
To resolve this issue, you must manually align the database cluster roles and ensure the recovery extensions are present.
1.) Correct the hosts file on all nodes
2.) Manually create the pg_pool_recovery extension
3.) Recover faulty nodes
From the primary node, run the recovery command for each faulty standby node:
/usr/local/bin/pcp_recovery_node -h <delegateIP> -p 9898 -U pgpool -n <node_id>
Verify the cluster status again using:
poolnodes
Confirm that only one node is primary and all others are standby.
4.) Run an Inventory Sync from Aria Lifecycle Manager for the VIDM cluster and confirm that the Health shows 'Green'.
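For step 1, comparing the hosts files by eye is error-prone. The sketch below prints every IP address that a hosts file maps to a given name, so the result can be compared across nodes. The function name hosts_ips_for is hypothetical, and the host name "master" in the usage line is an assumption; substitute whatever name your hosts files use for the master node.

```shell
# Hypothetical helper: list the IP addresses that a hosts file assigns to a name.
# $1 = host name to look up, $2 = hosts file (defaults to /etc/hosts)
hosts_ips_for() {
  awk -v name="$1" '
    { sub(/#.*/, "") }                 # drop comments before matching
    NF >= 2 {
      for (i = 2; i <= NF; i++) {
        if ($i == name) { print $1; break }
      }
    }
  ' "${2:-/etc/hosts}" | sort -u
}
```

Running `hosts_ips_for master` on each node should print the same single address everywhere. No output, several addresses, or different addresses on different nodes points to the mismatched entries described in the cause.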