When GPDB has crashed and is started again, gpstart on some clusters hangs at the stage shown below and later errors out:
[gpadmin@ ~]$ gpstart -a
gpstart::gpadmin-[INFO]:-Starting gpstart with args: -a
gpstart::gpadmin-[INFO]:-Gathering information and validating the environment...
gpstart::gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.11.2'
gpstart::gpadmin-[INFO]:-Greenplum Catalog Version: 'xxxxx'
gpstart::gpadmin-[INFO]:-Starting Master instance in admin mode
gpstart::gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
gpstart::gpadmin-[INFO]:-Obtaining Segment details from master...
gpstart::gpadmin-[CRITICAL]:-gpstart failed. (Reason='FATAL: the database system is starting up DETAIL: last replayed record at 0/0 ') exiting...
After gpstart fails, the master host shows only a logger process and a startup process. No postgres processes are running on any segment host.
Inspecting the startup process on the master (for example, with the Linux command lsof) shows that it is scanning files in the master data directory.
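The check described above can be sketched in shell as follows. This is only an illustration: the helper name startup_pids is hypothetical, and it assumes the process title shown by ps contains the text "startup process", which may vary between builds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every postgres startup process.
# Assumption: the process title contains the text "startup process".
startup_pids() {
    # !/awk/ skips the awk command itself if it shows up in the ps listing.
    awk '/postgres/ && /startup process/ && !/awk/ {print $2}'
}

# Usage, run as gpadmin on the master host:
#   ps -ef | startup_pids
#   lsof -p <pid> | grep "$MASTER_DATA_DIRECTORY"
# Running the lsof line repeatedly should show different files under the
# master data directory while the scan is still making progress.
```

If no PID is printed, the startup process has either finished or the process title differs from the assumed one, so checking the raw ps output by hand is the safer fallback.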
Product Version: 6.0
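This happens because, in GPDB v6, the postgres process scans all files under the relevant data directory when it runs crash recovery. For the master instance, it scans every file under $MASTER_DATA_DIRECTORY/. When that directory contains a large number of files, the scan can take long enough that gpstart appears to hang and eventually fails.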
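To get a rough sense of how much work the scan has to do, the files under the data directory can be counted. The snippet below is a minimal sketch, not an official Greenplum tool. The function name count_files is hypothetical, and it assumes file names do not contain newline characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of regular files under a directory.
# Assumption: no file name contains a newline (each file is one output line).
count_files() {
    # tr strips the padding that some wc implementations add around the count.
    find "$1" -type f | wc -l | tr -d ' '
}

# Usage, run as gpadmin on the master host:
#   count_files "$MASTER_DATA_DIRECTORY"
```

The count is only an indicator. How long the scan takes also depends on the storage and on how the files are laid out.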
Wait for the scan to finish, or contact the Greenplum support team to investigate further if needed.