We will take a look at a 2-node formation with the following setup:
[postgres@autofailover-m ~]$ pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+---------------------+-----------+-----------+---------------------+--------------------
node_1 | 1 | autofailover-1:5432 | 0/A000230 | yes | primary | primary
node_2 | 2 | autofailover-2:5432 | 0/A000230 | yes | secondary | secondary
And a 3-node formation with the following setup:
[postgres@autofailover1 ~]$ pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+--------------------+------------+-----------+---------------------+--------------------
node_1 | 1 | autofailover2:5432 | 0/13000148 | yes | primary | primary
node_3 | 3 | autofailover4:5432 | 0/13000148 | yes | secondary | secondary
node_5 | 5 | autofailover3:5432 | 0/13000148 | yes | secondary | secondary
When only the primary remains in a 2-node setup, the primary will allow writes as stated in the documentation.
Secondary node is monitored unhealthy
When the secondary node is unhealthy, the monitor assigns to it the state
CATCHINGUP and assigns the state
WAIT_PRIMARY to the primary node. When implementing the transition from
PRIMARY to
WAIT_PRIMARY, the keeper disables synchronous replication.
When the keeper reports an acceptable WAL difference in the two nodes again, the replication is upgraded back to being synchronous. While a secondary node is not in the
SECONDARY state, secondary promotion is disabled.
However, in a 3+node setup, a different behavior is observed where synchronous replication is not automatically disabled by the monitor and writes to the primary will give the following:
[postgres@autofailover2 ~]$ psql
psql (13.3 (VMware Postgres 13.3.0))
Type "help" for help.
postgres=# create table test1 (id int);
^CCancel request sent
WARNING: canceling wait for synchronous replication due to user request
DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.
CREATE TABLE
This can give the appearance that production is down when in fact this is expected behavior.
2 - Node Formation
If we bring down the secondary, only the primary will be left and the monitor will give the following states:
[postgres@autofailover-m ~]$ pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+---------------------+-----------+-----------+---------------------+--------------------
node_1 | 1 | autofailover-1:5432 | 0/A015178 | yes | wait_primary | wait_primary
node_2 | 2 | autofailover-2:5432 | 0/A000230 | no | secondary | catchingup
And the Postgres primary will allow writes:
[postgres@autofailover-1 ~]$ psql
psql (12.7 (VMware Postgres 12.7.0))
Type "help" for help.
postgres=# create table test (id int);
CREATE TABLE
3 - Node Formation
If we bring down all the secondaries, only the primary will be left and the monitor will give the following states:
[postgres@autofailover1 ~]$ pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+--------------------+------------+-----------+---------------------+--------------------
node_1 | 1 | autofailover2:5432 | 0/13000148 | yes | primary | primary
node_3 | 3 | autofailover4:5432 | 0/13000148 | no | secondary | catchingup
node_5 | 5 | autofailover3:5432 | 0/13000148 | no | secondary | catchingup
Notice that the primary is never updated to "
wait_primary". This means the primary is waiting for an available secondary to continue replication before it allows writes to the primary. Any attempt to write to the primary will be met with a hanging psql prompt. The
pg_auto_failover docs describe the difference in multi-standby setups:
The default value for
number_sync_standbys GUC in
pg_auto_failover is (
number_sync_standbys +1). So in this setup, the value is
1 by default which accounts for synchronous replication not being disabled automatically by the monitor when only the primary is left.
How to Allow Writes to the Primary
You can set
number_sync_standbys = 0 to disable this behavior if none of the secondaries are going to be available:
[postgres@autofailover1 ~]$ pg_autoctl set formation number-sync-standbys 0
22:37:20 23160 INFO Waiting for the settings to have been applied to the monitor and primary node
22:37:20 23160 INFO New state is reported by node 1 "node_1" (autofailover2:5432): "apply_settings"
22:37:20 23160 INFO Setting goal state of node 1 "node_1" (autofailover2:5432) to wait_primary because none of the 2 standby candidate nodes are healthy at the moment.
22:37:20 23160 INFO New state is reported by node 1 "node_1" (autofailover2:5432): "wait_primary"
22:37:20 23160 INFO primary node has now set synchronous_standby_names = ''
0
[postgres@autofailover1 ~]$ pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+--------------------+------------+-----------+---------------------+--------------------
node_1 | 1 | autofailover2:5432 | 0/13010BF8 | yes | wait_primary | wait_primary
node_3 | 3 | autofailover4:5432 | 0/13000148 | no | secondary | catchingup
node_5 | 5 | autofailover3:5432 | 0/13000148 | no | secondary | catchingup
After changing the config, you'll see the primary state updates to "
wait_primary" and writes to the database will resume.
Note: You will need to bring up a secondary to re-enable the default behavior