kubectl get pods -n <namespace>
Example:
# kubectl get pods -n ns-app01-postgres
NAME                                 READY   STATUS    RESTARTS   AGE
postgres-0                           5/5     Running   0          14d
postgres-1                           4/5     Running   0          2d20h
postgres-monitor-0                   4/4     Running   0          9d
postgres-operator-6cdf6bd9b7-qb8qq   1/1     Running   0          7d
Note: postgres-1 has 4 of 5 containers ready. This is the replica in the example that needs to be reinitialised.
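In a namespace with many pods, the READY column can be filtered rather than read by eye. The following is a minimal sketch, not part of the original procedure: the function name not_ready_pods is hypothetical, and it assumes the default column layout of "kubectl get pods" shown above (pod name first, READY second).

```shell
# Sketch: print the names of pods whose READY column shows fewer
# ready containers than total containers.
# Reads "kubectl get pods" output on stdin; not_ready_pods is a hypothetical name.
not_ready_pods() {
  awk 'NR > 1 {
         split($2, r, "/")                  # READY looks like "4/5"
         if (r[1] + 0 < r[2] + 0) print $1  # fewer ready than total
       }'
}

# Usage (assumes kubectl access to the namespace):
#   kubectl get pods -n ns-app01-postgres | not_ready_pods
```

Run against the example output above, this would list only postgres-1.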
kubectl get pod <replica-pod-name> -n <namespace> --show-labels
Example:
# kubectl get pod postgres-1 -n ns-app01-postgres --show-labels
NAME         READY   STATUS    RESTARTS   AGE   LABELS
postgres-0   4/5     Running   0          27m   app=postgres,controller-revision-hash=postgres-sample-84574d54c8,headless-service=postgres-sample,postgres-instance=postgres-sample,role=read,statefulset.kubernetes.io/pod-name=postgres-sample-0,type=data
Note: "role=read" means it is a replica. A primary pod would have "role=read-write".
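The role label is easy to miss in a long label list. As a rough sketch (the helper name pod_role is hypothetical, and the label key "role" is taken from the example above), the value can be pulled out of the LABELS column like this:

```shell
# Sketch: extract the value of the "role" label from a comma-separated
# label list such as the LABELS column of "kubectl get pod --show-labels".
# pod_role is a hypothetical helper name.
pod_role() {
  # $1 is the label list, e.g. "app=postgres,role=read,type=data"
  printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^role=//p'
}

# Usage (assumes kubectl access; LABELS is the last column):
#   labels=$(kubectl get pod postgres-1 -n ns-app01-postgres --no-headers --show-labels | awk '{print $NF}')
#   pod_role "$labels"
```

A result of "read" indicates a replica and "read-write" a primary, as described in the note above. Asking kubectl for the label directly with -o jsonpath='{.metadata.labels.role}' should give the same value without any text parsing.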
kubectl exec -it <replica-pod-name> -n <namespace> -c postgres-sidecar -- bash
Example:
# kubectl exec -it -n ns-app01-postgres postgres-1 -c postgres-sidecar -- bash
postgres@postgres-1:/$
ps -aef | egrep pg_auto
kill -s STOP $(pgrep -f "pg_autoctl: start/stop postgres")
pg_ctl stop
PGPASSWORD=$(pg_autoctl config get replication.password) PG_AUTOCTL_DEBUG=true pg_autoctl do standby init <primary-pod-name>.$(hostname -d) 5432
Example:
# PGPASSWORD=$(pg_autoctl config get replication.password) PG_AUTOCTL_DEBUG=true pg_autoctl do standby init postgres-0.$(hostname -d) 5432
Defaulted container "pg-container" out of: pg-container, instance-logging, reconfigure-instance, postgres-metrics-exporter, postgres-sidecar
16:14:56 43453 INFO Initialising PostgreSQL as a hot standby
16:14:56 43453 INFO Target directory exists: "/pgsql/data", stopping PostgreSQL
16:14:57 43453 INFO /opt/vmware/postgres/15/bin/pg_basebackup -w -d 'application_name=pgautofailover_standby_0 host=postgres-1.postgres-agent.default.svc.cluster.local port=5432 user=pgautofailover_replicator ' --pgdata /pgsql/backup/ -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream
16:14:57 43453 INFO pg_basebackup: initiating base backup, waiting for checkpoint to complete
16:14:57 43453 INFO pg_basebackup: checkpoint completed
16:14:57 43453 INFO pg_basebackup: write-ahead log start point: 0/4000028 on timeline 2
16:14:57 43453 INFO pg_basebackup: starting background WAL receiver
16:14:57 43453 INFO pg_basebackup: created temporary replication slot "pg_basebackup_618400"
16:14:58 43453 INFO 38802/54859 kB (70%), 0/1 tablespace (/pgsql/backup//base/24627/14027_vm )
16:14:58 43453 INFO 54870/54870 kB (100%), 0/1 tablespace (/pgsql/backup//global/pg_control )
16:14:58 43453 INFO 54870/54870 kB (100%), 1/1 tablespace
16:14:58 43453 INFO pg_basebackup: write-ahead log end point: 0/4000100
16:14:58 43453 INFO pg_basebackup: waiting for background process to finish streaming ...
16:14:58 43453 INFO pg_basebackup: syncing data to disk ...
16:15:00 43453 INFO pg_basebackup: renaming backup_manifest.tmp to backup_manifest
16:15:00 43453 INFO pg_basebackup: base backup completed
16:15:01 43453 INFO Creating the standby signal file at "/pgsql/data/standby.signal", and replication setup at "/pgsql/data/postgresql-auto-failover-standby.conf"
16:15:01 43453 INFO Contents of "/pgsql/data/postgresql-auto-failover-standby.conf" have changed, overwriting
16:15:01 43453 INFO Contents of "/pgsql/data/postgresql-auto-failover.conf" have changed, overwriting
16:15:02 43453 WARN Failed to read Postgres "postmaster.pid" file
16:15:11 43453 ERROR Failed to open file "/pgsql/data/postmaster.pid": No such file or directory
16:15:11 43453 INFO Is PostgreSQL at "/pgsql/data" up and running?
16:15:11 43453 ERROR Failed to get Postgres pid, see above for details
16:15:11 43453 ERROR Failed to ensure that Postgres is running in "/pgsql/data"
16:15:11 43453 FATAL Failed to grant access to the standby by adding relevant lines to pg_hba.conf for the standby hostname and user, see above for details
command terminated with exit code 4
Note: Postgres fails to start at the end of this command (the ERROR and FATAL messages in the example output). This is expected. It will be started properly in the next step.
Note: The pod specified in the command (postgres-0 in the example) is the primary pod.
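Because the command ends with a non-zero exit code even in the expected case, the exit status alone does not show whether the base backup itself succeeded. One possible check, sketched here under the assumption that the command output was captured to a file (the file path and the function name basebackup_completed are hypothetical), is to look for the completion message that appears in the example log:

```shell
# Sketch: report whether a captured pg_autoctl log contains the
# pg_basebackup completion message seen in the example output.
# Reads the log on stdin; returns 0 if the message is present.
# basebackup_completed is a hypothetical helper name.
basebackup_completed() {
  grep -q 'pg_basebackup: base backup completed'
}

# Usage (assumes the init command was run with its output captured, e.g.
# by appending "2>&1 | tee /tmp/standby-init.log" to it; the path is hypothetical):
#   basebackup_completed < /tmp/standby-init.log && echo "base backup OK"
```

If the message is missing, the failure is probably not the expected startup error described in the note above, and the log should be reviewed before the pod is deleted.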
kubectl delete pod -n <namespace> <replica-pod-name>
Example:
# kubectl delete pod -n ns-app01-postgres postgres-1 --grace-period=0
pod "postgres-1" deleted
Note: It may be necessary to add the flag "--force=true" if the pod does not terminate.
kubectl get pods -n <namespace>
Example:
# kubectl get pods -n ns-app01-postgres
NAME                                 READY   STATUS    RESTARTS   AGE
postgres-0                           5/5     Running   0          14d
postgres-1                           5/5     Running   0          10min
postgres-monitor-0                   4/4     Running   0          9d
postgres-operator-6cdf6bd9b7-qb8qq   1/1     Running   0          7d
Note: postgres-1 has 5 of 5 containers ready now.
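The recreated pod can take some time to reach 5/5, so the check above may need repeating. The following is a small sketch of a generic retry loop, not part of the original procedure: the function name wait_until, the attempt count and the delay are all assumptions chosen for illustration.

```shell
# Sketch: retry a check command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
# wait_until is a hypothetical helper name.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@"; then
      return 0                      # check passed
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1                          # check never passed
}

# Usage (assumes kubectl access; retries up to 30 times, 10 seconds apart):
#   wait_until 30 10 kubectl wait --for=condition=Ready pod/postgres-1 \
#     -n ns-app01-postgres --timeout=1s
```

When readiness is the only condition of interest, "kubectl wait" with a longer --timeout does the same job on its own. The loop is mainly useful when the check is a custom command.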