Replicating a Postgres database with pg_autoctl fails with "pg_autoctl service node-init failed with exit status 0" on pg_auto_failover

Article ID: 296401

Products

VMware Tanzu Greenplum

Issue/Introduction

pg_auto_failover includes the command line tool pg_autoctl that implements many commands to manage your Postgres nodes.


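Creating a replica (standby) node with pg_autoctl fails with the warning "pg_autoctl service node-init failed with exit status 0". In the log, the warning appears immediately after pg_basebackup reports that the base backup completed.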

For example:

$ pg_autoctl create postgres --name pgauto02 --auth md5 --ssl-self-signed --pgdata $PGDATA --monitor postgres://autoctl_node:changeme@pgauto00/pg_auto_failover?sslmode=prefer    --dbname postgres run
09:25:47 4297 INFO  Using default --ssl-mode "require"
09:25:47 4297 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
09:25:47 4297 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
09:25:47 4297 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
09:25:47 4297 INFO  Started pg_autoctl postgres service with pid 4300
09:25:47 4297 INFO  Started pg_autoctl node-init service with pid 4301
09:25:47 4300 INFO   /usr/bin/pg_autoctl do service postgres --pgdata /var/lib/pgsql/data -v
09:25:47 4301 INFO  Registered node 13 (pgauto02:5432) with name "pgauto02" in formation "default", group 0, state "wait_standby"
09:25:47 4301 INFO  Writing keeper state file at "/var/lib/pgsql/.local/share/pg_autoctl/var/lib/pgsql/data/pg_autoctl.state"
09:25:47 4301 INFO  Writing keeper init state file at "/var/lib/pgsql/.local/share/pg_autoctl/var/lib/pgsql/data/pg_autoctl.init"
09:25:47 4301 INFO  Successfully registered as "wait_standby" to the monitor.
09:25:47 4301 INFO  FSM transition from "init" to "wait_standby": Start following a primary
09:25:47 4301 INFO  Transition complete: current state is now "wait_standby"
09:25:47 4301 INFO  New state for node 5 "pgauto01" (pgauto01:5432): single ➜ wait_primary
09:25:47 4301 INFO  New state for node 5 "pgauto01" (pgauto01:5432): wait_primary ➜ wait_primary
09:25:47 4301 INFO  Still waiting for the monitor to drive us to state "catchingup"
09:25:47 4301 WARN  Please make sure that the primary node is currently running `pg_autoctl run` and contacting the monitor.
09:25:47 4301 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
09:25:47 4301 INFO  Initialising PostgreSQL as a hot standby
09:25:47 4301 INFO   /usr/bin/pg_basebackup -w -d application_name=pgautofailover_standby_13 host=pgauto01 port=5432 user=pgautofailover_replicator sslmode=require --pgdata /var/lib/pgsql/backup/node_6 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_13
09:25:47 4301 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
09:25:48 4301 INFO  pg_basebackup: checkpoint completed
09:25:48 4301 INFO  pg_basebackup: write-ahead log start point: 0/C000028 on timeline 1
09:25:48 4301 INFO  pg_basebackup: starting background WAL receiver
09:25:48 4301 INFO      0/25006 kB (0%), 0/1 tablespace (...pgsql/backup/node_6/backup_label)
09:25:48 4301 INFO  25015/25015 kB (100%), 0/1 tablespace (.../backup/node_6/global/pg_control)
09:25:48 4301 INFO  25015/25015 kB (100%), 1/1 tablespace                                         
09:25:48 4301 INFO  pg_basebackup: write-ahead log end point: 0/C000100
09:25:48 4301 INFO  pg_basebackup:
09:25:48 4301 INFO   waiting for background process to finish streaming ...
09:25:48 4301 INFO  pg_basebackup: syncing data to disk ...
09:25:48 4301 INFO  pg_basebackup: renaming backup_manifest.tmp to backup_manifest
09:25:48 4301 INFO  pg_basebackup: base backup completed
09:25:48 4297 WARN  pg_autoctl service node-init failed with exit status 0
09:25:48 4300 INFO  Postgres controller service received signal SIGTERM, terminating
09:25:48 4297 INFO  Stop pg_autoctl


Environment

Product Version: 13.2

Resolution

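The pg_basebackup step of the standby initialisation must be able to remove the $PGDATA directory completely and recreate it. A mountpoint cannot be removed while its filesystem is mounted, so the step fails when $PGDATA is itself a mountpoint. Verify the following:
  • The $PGDATA directory is not a mountpoint.
  • The operating system user that runs pg_autoctl and Postgres can delete and recreate the directory.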
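
The checks above can be sketched as a small shell function. This is only an illustration, not part of pg_auto_failover: the name check_pgdata is hypothetical, and the sketch assumes the util-linux mountpoint command is available (the mountpoint check is skipped if it is not). It treats a directory as replaceable when its parent directory is writable and searchable by the current user.

```shell
# Hypothetical helper (not shipped with pg_autoctl): reports whether a
# candidate data directory can be removed and recreated by the current user.
check_pgdata() {
    dir="$1"
    parent=$(dirname "$dir")

    # A mountpoint cannot be removed while its filesystem is mounted.
    # Assumes the util-linux "mountpoint" command; skipped if it is missing.
    if command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$dir"; then
        echo "mountpoint"
        return 1
    fi

    # Deleting and recreating a directory requires write and search
    # permission on its parent directory.
    if [ ! -w "$parent" ] || [ ! -x "$parent" ]; then
        echo "parent-not-writable"
        return 1
    fi

    echo "ok"
}
```

Run it as the same operating system user that runs pg_autoctl, for example check_pgdata "$PGDATA". Any result other than "ok" points to one of the conditions listed above.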

Workaround

If $PGDATA is a mountpoint for a filesystem, change $PGDATA to a subdirectory inside the mounted filesystem.

For example:
  • A filesystem is mounted at /var/lib/pgsql/data.
  • $PGDATA is set to /var/lib/pgsql/data.
  • Change $PGDATA to /var/lib/pgsql/data/data.
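
As a minimal sketch of this workaround, the snippet below derives the nested directory from the mountpoint path and exports it as $PGDATA. The helper name pgdata_under_mount is hypothetical, and the paths are the ones from the example above; adjust them to the actual mount.

```shell
# Hypothetical helper: build a data directory path nested inside a mountpoint.
# Strips one trailing slash so "/mnt/x/" and "/mnt/x" give the same result.
pgdata_under_mount() {
    printf '%s/data\n' "${1%/}"
}

# Mountpoint taken from the example above (an assumption about the host).
PGDATA=$(pgdata_under_mount /var/lib/pgsql/data)
export PGDATA
echo "$PGDATA"

# Afterwards, re-run the same "pg_autoctl create postgres ... --pgdata $PGDATA"
# command shown in the Issue section.
```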