The current configuration on the database is:
flightdata=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  4270 | mdw      | mdw     |                  |
    2 |       0 | p    | p              | s    | u      | 42700 | sdw3     | sdw3    |            46730 |
    4 |       2 | p    | p              | s    | u      | 42700 | sdw4     | sdw4    |            46730 |
    3 |       1 | p    | p              | s    | u      | 42701 | sdw3     | sdw3    |            46731 |
    5 |       3 | p    | p              | s    | u      | 42701 | sdw4     | sdw4    |            46731 |
    6 |       0 | m    | m              | s    | u      | 52700 | sdw4     | sdw4    |            56350 |
    7 |       1 | m    | m              | s    | u      | 52701 | sdw4     | sdw4    |            56351 |
    8 |       2 | m    | m              | s    | u      | 52700 | sdw3     | sdw3    |            56350 |
    9 |       3 | m    | m              | s    | u      | 52701 | sdw3     | sdw3    |            56351 |
(9 rows)
Ensure that there are no segments that are down:
Note: If the corresponding primary of the mirror you wish to move is down, gprecoverseg won't work and this procedure is not applicable.
flightdata=# select * from gp_segment_configuration where status='d';
 dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+------+----------+---------+------------------+------------
(0 rows)
Ensure all the segments are in their preferred_role:
flightdata=# select * from gp_segment_configuration where role <> preferred_role;
 dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+------+----------+---------+------------------+------------
(0 rows)
Ensure that the new host (e.g., sdw5) where the mirror is being moved has the Greenplum binaries installed and no existing directory with the same name:
[gpadmin@mdw faisal]$ echo $GPHOME
/usr/local/GP-4.2.7.0
[gpadmin@mdw faisal]$ ssh sdw5
[gpadmin@sdw5 ~]$ ls -ltr /usr/local/ | grep 4.2.7
drwxr-xr-x 12 gpadmin gpadmin 4096 Feb 19 2014 GP-4.2.7.0
lrwxrwxrwx  1 gpadmin gpadmin   10 Feb 19 2014 greenplum-db -> GP-4.2.7.0
[gpadmin@sdw5 ~]$ ls -ltr /data2/mirror/fai_42703
ls: /data2/mirror/fai_42703: No such file or directory
[gpadmin@sdw5 ~]$ ls -ltr /data2/mirror/fai_42700
ls: /data2/mirror/fai_42700: No such file or directory
Note: The parent directory should exist (i.e., /data2/mirror in the above example).
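If the parent directory does not yet exist on the new host, it can be created ahead of time. A minimal sketch, assuming the example's /data2/mirror path and a hypothetical helper function (not part of any Greenplum utility):

```shell
# Hypothetical helper: ensure a parent directory exists on a host.
# An empty host means run locally (useful for dry runs); otherwise the
# command runs over ssh. mkdir -p is a no-op if the directory exists.
ensure_parent_dir() {
    local host="$1" dir="$2"
    if [ -n "$host" ]; then
        ssh "$host" "mkdir -p '$dir'"
    else
        mkdir -p "$dir"
    fi
}

# In this example the mirrors move to sdw5 under /data2/mirror:
# ensure_parent_dir sdw5 /data2/mirror
```

The directory must be owned (or at least writable) by gpadmin, since gprecoverseg creates the segment data directory beneath it as that user.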
Stop the mirror segments that you wish to move. In my cluster, I wish to move seg3 (sdw3:52701) and seg0 (sdw4:52700) to the new host sdw5:
[gpadmin@mdw faisal]$ ssh sdw3
[gpadmin@sdw3 ~]$ pg_ctl stop -D /data2/mirror/fai_42703
waiting for server to shut down.... done
server stopped
[gpadmin@sdw3 ~]$ exit
[gpadmin@mdw faisal]$ ssh sdw4
[gpadmin@sdw4 ~]$ pg_ctl stop -D /data2/mirror/fai_42700
waiting for server to shut down.... done
server stopped
Wait for the FTS to update the status of the mirror segments:
flightdata=# select * from gp_segment_configuration where status='d';
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    9 |       3 | m    | m              | s    | d      | 52701 | sdw3     | sdw3    |            56351 |
    6 |       0 | m    | m              | s    | d      | 52700 | sdw4     | sdw4    |            56350 |
(2 rows)
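Rather than re-running the query by hand, the wait can be scripted. A sketch, using a hypothetical polling function (assuming psql connects to the right database via the usual PG* environment variables):

```shell
# Hypothetical helper: poll gp_segment_configuration until FTS has
# marked at least the expected number of segments down.
wait_for_down() {
    local expected="$1" count
    while true; do
        count=$(psql -At -c \
            "select count(*) from gp_segment_configuration where status='d';")
        [ "$count" -ge "$expected" ] && break
        sleep 5
    done
    echo "FTS reports $count segment(s) down"
}

# wait_for_down 2   # two mirrors were stopped in this example
```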
Once done, create the configuration file using the command below:
[gpadmin@mdw faisal]$ gprecoverseg -o /tmp/configuration.out
20150130:03:25:59:011419 gprecoverseg:mdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -o /tmp/configuration.out
20150130:03:25:59:011419 gprecoverseg:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.2.7.0 build 2'
20150130:03:25:59:011419 gprecoverseg:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.2.7.0 build 2) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Dec 10 2013 14:37:53'
20150130:03:25:59:011419 gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150130:03:25:59:011419 gprecoverseg:mdw:gpadmin-[INFO]:-Configuration file output to /tmp/configuration.out successfully.
The content will look like the following:
filespaceOrder=
sdw4:52700:/data2/mirror/fai_42700
sdw3:52701:/data2/mirror/fai_42703
Modify the content so that each line also specifies where the segment is being moved to, in the format below.
Note: You can change the host, port, directory, etc. in the appended portion to whatever is preferred; when the segments are recovered, gprecoverseg will take in those changes and recover them on the new host, port, directory, etc.
<failed hostname>:<failed port>:<failed directory> <new hostname>:<new port>:<new replication port>:<new directory>

filespaceOrder=
sdw4:52700:/data2/mirror/fai_42700 sdw5:52700:56351:/data2/mirror/fai_42700
sdw3:52701:/data2/mirror/fai_42703 sdw5:52701:56350:/data2/mirror/fai_42703
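A typo in this file (for example, a missing replication port) only surfaces when gprecoverseg parses it, so a quick field-count check before running it can save a round trip. A sketch, using a hypothetical validation function that only checks the shape of each relocation line, not host reachability or port availability:

```shell
# Hypothetical sanity check: a relocation line is
#   <host>:<port>:<dir> <host>:<port>:<replication port>:<dir>
# i.e. 3 colon-separated fields, a space, then 4 fields.
valid_recovery_line() {
    local old new
    old=${1%% *}   # text before the first space (failed segment spec)
    new=${1#* }    # text after the first space (new location spec)
    [ "$(awk -F: '{print NF}' <<<"$old")" -eq 3 ] &&
    [ "$(awk -F: '{print NF}' <<<"$new")" -eq 4 ]
}

# valid_recovery_line "sdw4:52700:/data2/mirror/fai_42700 sdw5:52700:56351:/data2/mirror/fai_42700"
```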
Recover the segments using the command below. This performs a FULL recovery, as indicated by "Synchronization mode = Full" in the output:
[gpadmin@mdw faisal]$ gprecoverseg -i /tmp/configuration.out
20150130:03:28:27:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -i /tmp/configuration.out
20150130:03:28:27:011605 gprecoverseg:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.2.7.0 build 2'
20150130:03:28:27:011605 gprecoverseg:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.2.7.0 build 2) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Dec 10 2013 14:37:53'
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Recovery from configuration -i option supplied
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Recovery 1 of 2
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Synchronization mode               = Full
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance host               = sdw4
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance address            = sdw4
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance directory          = /data2/mirror/fai_42700
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance port               = 52700
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance replication port   = 56350
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance host      = sdw3
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance address   = sdw3
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance directory = /data1/primary/fai_42700
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance port      = 42700
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance replication port = 46730
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance host      = sdw5
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance address   = sdw5
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance directory = /data2/mirror/fai_42700
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance port      = 52700
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance replication port = 56351
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Recovery 2 of 2
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Synchronization mode               = Full
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance host               = sdw3
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance address            = sdw3
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance directory          = /data2/mirror/fai_42703
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance port               = 52701
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance replication port   = 56351
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance host      = sdw4
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance address   = sdw4
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance directory = /data1/primary/fai_42703
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance port      = 42701
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance replication port = 46731
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance host      = sdw5
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance address   = sdw5
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance directory = /data2/mirror/fai_42703
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance port      = 52701
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target instance replication port = 56350
20150130:03:28:28:011605 gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> y

20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions
20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-The packages on sdw5 are consistent.
20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-2 segment(s) to recover
20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Ensuring 2 failed segment(s) are stopped
20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Skipping to stop segment /data2/mirror/fai_42700 on host sdw4 as it is not a postgres process
20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Skipping to stop segment /data2/mirror/fai_42703 on host sdw3 as it is not a postgres process
20150130:03:28:29:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Building template directory
20150130:03:28:30:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Validating remote directories .
20150130:03:28:31:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Copying template directory file .
20150130:03:28:32:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Configuring new segments .
20150130:03:28:33:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Cleaning files .
20150130:03:28:34:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Starting file move procedure for sdw5:/data2/mirror/fai_42700:content=0:dbid=6:mode=s:status=d
20150130:03:28:34:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Starting file move procedure for sdw5:/data2/mirror/fai_42703:content=3:dbid=9:mode=s:status=d
updating flat files
20150130:03:28:34:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Updating configuration with new mirrors
20150130:03:28:34:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Updating mirrors .
20150130:03:28:35:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Starting mirrors
20150130:03:28:35:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait... ................
20150130:03:28:51:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150130:03:28:51:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Updating configuration to mark mirrors up
20150130:03:28:51:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Updating primaries
20150130:03:28:51:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Commencing parallel primary conversion of 2 segments, please wait... .
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Done updating primaries
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-Use gpstate -s to check the resynchronization progress.
20150130:03:28:52:011605 gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
Use "gpstate -e" to check the progress of the recovery; once it is done, you will see the message "All segments are running normally":
[gpadmin@mdw faisal]$ gpstate -e
20150130:03:29:42:012546 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20150130:03:29:42:012546 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.2.7.0 build 2'
20150130:03:29:42:012546 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.2.7.0 build 2) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Dec 10 2013 14:37:53'
20150130:03:29:42:012546 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150130:03:29:42:012546 gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments... .
20150130:03:29:43:012546 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150130:03:29:43:012546 gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20150130:03:29:43:012546 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150130:03:29:43:012546 gpstate:mdw:gpadmin-[INFO]:-All segments are running normally
Cross-checking the configuration for the moved mirrors (e.g., content 3 and 0) shows that they are now on the new host, sdw5:
flightdata=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  4270 | mdw      | mdw     |                  |
    4 |       2 | p    | p              | s    | u      | 42700 | sdw4     | sdw4    |            46730 |
    3 |       1 | p    | p              | s    | u      | 42701 | sdw3     | sdw3    |            46731 |
    7 |       1 | m    | m              | s    | u      | 52701 | sdw4     | sdw4    |            56351 |
    8 |       2 | m    | m              | s    | u      | 52700 | sdw3     | sdw3    |            56350 |
    2 |       0 | p    | p              | s    | u      | 42700 | sdw3     | sdw3    |            46730 |
    6 |       0 | m    | m              | s    | u      | 52700 | sdw5     | sdw5    |            56351 |
    5 |       3 | p    | p              | s    | u      | 42701 | sdw4     | sdw4    |            46731 |
    9 |       3 | m    | m              | s    | u      | 52701 | sdw5     | sdw5    |            56350 |
(9 rows)
On the server sdw5:
[gpadmin@sdw5 ~]$ ps -ef | grep mirror | grep silent
gpadmin 5465 1 0 06:28 ? 00:00:00 /usr/local/GP-4.2.7.0/bin/postgres -D /data2/mirror/fai_42700 -p 52700 -b 6 -z 4 --silent-mode=true -i -M quiescent -C 0
gpadmin 5466 1 0 06:28 ? 00:00:00 /usr/local/GP-4.2.7.0/bin/postgres -D /data2/mirror/fai_42703 -p 52701 -b 9 -z 4 --silent-mode=true -i -M quiescent -C 3
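As a final check, resynchronization is only fully complete when every segment is back in synchronized mode ('s' in the mode column of gp_segment_configuration). A sketch, using a hypothetical helper (assuming psql connects to the database via the usual PG* environment variables):

```shell
# Hypothetical helper: report whether all segments have finished
# resynchronizing. Any row with mode <> 's' means resync (or change
# tracking) is still in progress.
all_in_sync() {
    local pending
    pending=$(psql -At -c \
        "select count(*) from gp_segment_configuration where mode <> 's';")
    [ "$pending" -eq 0 ] && echo "all segments synchronized"
}

# all_in_sync
```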