repmgr shows the cell(s) as failed when running: sudo -i -u postgres repmgr cluster show
systemctl status vpostgres.service
The logs in /var/vmware/vpostgres/current/pgdata/log show the following errors:
LOG: could not accept SSL connection: tlsv1 alert internal error
FATAL: the database system is shutting down
LOG: database system is shut down
10.x, 10.3.3
This issue can occur if the cell(s) lose network connectivity, causing the postgres service to fail.
Obtain the ID of the failed cell.
# sudo -i -u postgres repmgr cluster show
Un-register the failed cell from the Repmgr Cluster.
# sudo -i -u postgres repmgr standby unregister -f /opt/vmware/vpostgres/current/etc/repmgr.conf --node-id=xxxxx
Delete the Postgres Database from the Failed Standby Cell.
Stop the vCD Services.
# service vmware-vcd stop
Stop the Postgres Service.
# systemctl stop vpostgres.service
Delete the Postgres Database from the failed standby cell.
# rm -rf /var/vmware/vpostgres/current/pgdata
Copy the Primary Cell Database to the Standby Cell.
# sudo -i -u postgres repmgr -h <primary_database_ip> -U repmgr -d repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf standby clone
Start Postgres Service.
# systemctl start vpostgres.service
Register the Standby Cell with the Repmgr Cluster.
# sudo -i -u postgres repmgr -h <primary_database_ip> -U repmgr -d repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf standby register --force
Verify that the Standby Cell is registered with the Repmgr Cluster successfully and that the error message and replication lag no longer appear for the Standby Cell.
# sudo -i -u postgres repmgr cluster show
Start the vCD Services.
# service vmware-vcd start
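
To find the failed cell's ID in the first step, or to confirm in the verification step that no failed cells remain, the output of repmgr cluster show can be filtered with a small helper. The following is only a sketch: failed_node_ids is a hypothetical function name, not part of repmgr, and it assumes the usual pipe-separated table layout (ID | Name | Role | Status | ...) in which a broken node's Status column contains "failed" or "unreachable". Check the output format on your own cells before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with repmgr or vCD).
# Reads `repmgr cluster show` output on stdin and prints the ID of every
# node whose Status column contains "failed" or "unreachable".
# Assumption: columns are pipe-separated in the order ID | Name | Role | Status | ...
failed_node_ids() {
    awk -F'|' '
        # Only consider data rows: at least 4 columns and a numeric ID.
        NF >= 4 && $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            status = tolower($4)
            if (status ~ /failed|unreachable/) {
                id = $1
                gsub(/[ \t]/, "", id)   # strip padding around the ID
                print id
            }
        }
    '
}

# Example usage on a cell (commented out so the sketch has no side effects):
#   sudo -i -u postgres repmgr cluster show | failed_node_ids
```

If the helper prints nothing after the standby has been re-registered, no node is reported as failed or unreachable. It does not check replication lag, so the full repmgr cluster show output should still be reviewed.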