In the following scenario, the database is shut down and cannot be started on the normal coordinator host, so the standby is activated with a forced gpactivatestandby. This can result in the standby coordinator being started in utility mode:
[gpadmin@cdw ~]$ gpstate -e
20250521:10:51:40:439052 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20250521:10:51:40:439052 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.3.3 build commit:ce20fc237ed7520a2476c96ed7d9edddea136932'
20250521:10:51:40:439052 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.3.3 build commit:ce20fc237ed7520a2476c96ed7d9edddea136932) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit compiled on Dec 18 2024 05:34:04 Bhuvnesh C.'
20250521:10:51:40:439052 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20250521:10:51:40:439052 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20250521:10:51:41:439052 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20250521:10:51:41:439052 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20250521:10:51:41:439052 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20250521:10:51:41:439052 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally
[gpadmin@cdw ~]$ gpstate -f
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -f
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.3.3 build commit:ce20fc237ed7520a2476c96ed7d9edddea136932'
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.3.3 build commit:ce20fc237ed7520a2476c96ed7d9edddea136932) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit compiled on Dec 18 2024 05:34:04 Bhuvnesh C.'
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:-Standby coordinator details
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:-----------------------
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:- Standby address = scdw
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:- Standby data directory = /data/coordinator/gpseg-1
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:- Standby port = 5432
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:- Standby PID = 3387966
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:- Standby status = Standby host passive
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--------------------------------------------------------------
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--pg_stat_replication
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--------------------------------------------------------------
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--WAL Sender State: streaming
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--Sync state: sync
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--Sent Location: 0/A0000060
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--Flush Location: 0/A0000060
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--Replay Location: 0/A0000060
20250521:10:51:45:439094 gpstate:cdw:gpadmin-[INFO]:--------------------------------------------------------------
[gpadmin@cdw ~]$ gpstop -af
[gpadmin@cdw ~]$ gpactivatestandby -f
:
:
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[DEBUG]:-Check if Coordinator is already running...
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[WARNING]:-****************************************************************************
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[WARNING]:-Coordinator-only start requested. If a standby is configured, this command
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[WARNING]:-may lead to a split-brain condition and possible unrecoverable data loss.
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[WARNING]:-Maintenance mode should only be used with direction from Greenplum Support.
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[WARNING]:-****************************************************************************
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[DEBUG]:-Running Command: $GPHOME/sbin/gpconfig_helper.py --file /data/coordinator/gpseg-1/postgresql.conf --get-parameter gp_segment_configuration_file
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=utility " start
20250521:11:11:43:3394218 gpstart:scdw:gpadmin-[DEBUG]:-Running Command: env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=utility " start
20250521:11:11:44:3394218 gpstart:scdw:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20250521:11:11:44:3394218 gpstart:scdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20250521:11:11:44:3394218 gpstart:scdw:gpadmin-[DEBUG]:-Connecting to db template1 on host localhost
20250521:11:11:44:3394218 gpstart:scdw:gpadmin-[ERROR]:-gpstart failed. exiting...
Traceback (most recent call last):
  File "/usr/local/greenplum-db-7.3.3/lib/python/gppylib/mainUtils.py", line 361, in simple_main_locked
    exitCode = commandObject.run()
  File "/usr/local/greenplum-db-7.3.3/bin/gpstart", line 120, in run
    self._startCoordinator()
  File "/usr/local/greenplum-db-7.3.3/bin/gpstart", line 435, in _startCoordinator
    self.gparray = GpArray.initFromCatalog(self.dburl, utility=True)
  File "/usr/local/greenplum-db-7.3.3/lib/python/gppylib/gparray.py", line 990, in initFromCatalog
    with closing(dbconn.connect(dbURL, utility)) as conn:
  File "/usr/local/greenplum-db-7.3.3/lib/python/gppylib/db/dbconn.py", line 238, in connect
    conn = psycopg2.connect(**conninfo)
  File "/usr/lib64/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL: the database system is starting up
DETAIL: last replayed record at 0/A0000208
'
stderr=''
The standby coordinator is now running in utility mode:
[gpadmin@scdw ~]$ ps -aef | egrep -- -D
gpadmin 3394230 1 0 11:11 ? 00:00:00 /usr/local/greenplum-db-7.3.3/bin/postgres -D /data/coordinator/gpseg-1 -c gp_role=utility
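To confirm the role, a utility-mode connection can be made directly to the coordinator. This is a minimal sketch: the port and database are taken from the output above, and PGOPTIONS is the standard way to open a utility-mode session in Greenplum 7.
# Connect in utility mode and check the session role.
PGOPTIONS='-c gp_role=utility' psql -p 5432 -d template1 -c 'show gp_role;'   # reports "utility"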
Greenplum 7.x
An upstream (PostgreSQL) change made the command "pg_ctl start -w" rely on the postmaster status recorded in the "postmaster.pid" file to determine whether the postmaster has started.
The status to wait for can be either "ready" or "standby" (in upstream); either status causes "pg_ctl start -w" to return.
This status is not cleared after server shutdown. So, if a server was once a standby but later becomes a primary, its existing "postmaster.pid" file still contains the "standby" status, and "pg_ctl start -w" does not wait.
This makes "gpstart" query the server while it is still recovering, hence the error "FATAL: the database system is starting up".
In contrast, Greenplum 6X retains the old logic, in which "pg_ctl -w" keeps testing whether a connection can be made ("test_postmaster_connection") before returning, so the issue does not occur there.
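For illustration, the stale status that "pg_ctl start -w" consults can be inspected directly. This is a minimal sketch, assuming the data directory from the example above; in PostgreSQL 10 and later the postmaster status ("starting", "ready", "standby", or "stopping") is the last line of postmaster.pid:
# Show the status line that pg_ctl waits on (data directory from the example above).
COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1
tail -n 1 "$COORDINATOR_DATA_DIRECTORY/postmaster.pid"
# A leftover "standby" entry here, kept from the host's time as a standby,
# makes "pg_ctl start -w" return before WAL replay has finished.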
After the failed gpactivatestandby command, stop the coordinator on the standby coordinator host:
gpstop -am
Start the database in normal mode:
gpstart -a
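Afterwards, verify that the coordinator is back in normal dispatch mode. A hedged example follows; the exact gpstate wording may differ between minor versions:
# Check the cluster status and the coordinator's session role.
gpstate -s | grep -i coordinator         # coordinator instance should be listed as Active
psql -d template1 -c 'show gp_role;'     # should report "dispatch"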
The code fix will be available in Greenplum 7.5 and later.