Q: Is it possible to perform PITR recovery with Patroni using pgBackrest?
A: As of pgBackrest 2.23 and Patroni 1.6.4, it is only possible to perform a PITR if you bootstrap the cluster. Normally in a single instance setup, you can stop the database and run this command:
pgbackrest --stanza=main --delta --type=time "--target=<timestamp>" --target-action=promote restore
However, in Patroni this is not possible. If you attempt to run it this way you will get recovery errors on startup. The only way to do it is to use custom bootstrap method which basically recreates the cluster. We will demonstrate this below. We will assume that patroni yaml file is setup with these parameters in the bootstrap section:
bootstrap: method: pgbackrest pgbackrest: command: /home/postgres/custom_bootstrap.sh keep_existing_recovery_conf: False no_params: False recovery_conf: recovery_target: immediate recovery_target_action: pause restore_command: pgbackrest --stanza=main archive-get %f %p
More details on the pgbackrest setup we are using can be found in this article: How to: Use pgbackrest to bootstrap or add replica to HA Patroni
postgres=# create table kevin(id int); CREATE TABLE postgres=# insert into kevin values(generate_series(1,1000)); INSERT 0 10002. Create a full backup using pgbackrest.
postgres=# select current_timestamp; current_timestamp ------------------------------- 2020-07-02 19:20:54.320877+01 (1 row)4. "Accidentally" drop the table:
postgres=# drop table kevin; DROP TABLE postgres=# checkpoint; CHECKPOINT postgres=# select * from kevin; ERROR: relation "kevin" does not exist LINE 1: select * from kevin;5. Stop Patroni nodes and remove the cluster
[postgres@postgres_node_2 ~]$ patronictl -c postgres2.yml remove patroni_backrest +------------------+-------------+--------------------+------+---------+----+-----------+ | Cluster | Member | Host | Role | State | TL | Lag in MB | +------------------+-------------+--------------------+------+---------+----+-----------+ | patroni_backrest | postgresql1 | 10.193.102.52:5532 | | stopped | | unknown | +------------------+-------------+--------------------+------+---------+----+-----------+ Please confirm the cluster name to remove: patroni_backrest You are about to remove all information in DCS for patroni_backrest, please type: "Yes I am aware": Yes I am aware​6. Change custom_bootstrap.sh and put the time we observed earlier. This is the restore point when the table still existed:
#!/bin/sh mkdir -p /home/postgres/data pgbackrest --stanza=main --log-level-console=info --delta --type=time "--target=2020-07-02 19:20:54.320877+01" --target-action=promote restore7. Move or delete the old data directory
2020-07-02 19:30:41.147 P01 INFO: restore file /home/postgres/data/base/1/13723 (0B, 100%) 2020-07-02 19:30:41.149 P01 INFO: restore file /home/postgres/data/base/1/13718 (0B, 100%) 2020-07-02 19:30:41.149 P00 INFO: write /home/postgres/data/recovery.conf 2020-07-02 19:30:41.156 P00 INFO: restore global/pg_control (performed last to ensure aborted restores cannot be started) 2020-07-02 19:30:41,263 INFO: Lock owner: None; I am postgresql1 2020-07-02 19:30:41,263 INFO: not healthy enough for leader race 2020-07-02 19:30:41,269 INFO: bootstrap in progress 2020-07-02 19:30:41.460 P00 INFO: restore command end: completed successfully (9943ms) 2020-07-02 19:30:41.924 BST [23510] LOG: listening on IPv4 address "0.0.0.0", port 5532 2020-07-02 19:30:41.924 BST [23510] LOG: listening on IPv6 address "::", port 5532 2020-07-02 19:30:41.938 BST [23510] LOG: listening on Unix socket "./.s.PGSQL.5532" 2020-07-02 19:30:41.966 BST [23510] LOG: redirecting log output to logging collector process 2020-07-02 19:30:41.966 BST [23510] HINT: Future log output will appear in directory "pg_log". 2020-07-02 19:30:41,977 INFO: postmaster pid=23510 localhost:5532 - rejecting connections localhost:5532 - rejecting connections localhost:5532 - accepting connections 2020-07-02 19:30:43,021 INFO: establishing a new patroni connection to the postgres cluster 2020-07-02 19:30:43,034 INFO: waiting for end of recovery after bootstrap9. Run pg_wal_replay_resume();
psql -h localhost -p 5532 -c 'select pg_wal_replay_resume();'10. Once you get the leader lock, the table should be back:
[postgres@postgres_node_2 ~]$ psql -h localhost -p 5532 psql (11.7 (VMware Postgres 11.7.3)) Type "help" for help. postgres=# select * from kevin; id ------ 1 2 3 4 5 6 711. Add the replicas per your normal process.