1. Log on to the HAWQ master host as gpadmin and check that the HAWQ cluster is running normally with gpstate -s. For example:
[gpadmin@hdm2 ~]$ gpstate -s
20140716:23:43:29:448564 gpstate:hdm2:gpadmin-[INFO]:-Starting gpstate with args: -s
20140716:23:43:30:448564 gpstate:hdm2:gpadmin-[INFO]:-local HAWQ Version: 'postgres (HAWQ) 4.2.0 build 1'
20140716:23:43:30:448564 gpstate:hdm2:gpadmin-[INFO]:-master HAWQ Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 1.2.0.1 build 8119) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Apr 23 2014 16:12:32'
20140716:23:43:31:448564 gpstate:hdm2:gpadmin-[INFO]:-Obtaining Segment details from master...
20140716:23:43:31:448564 gpstate:hdm2:gpadmin-[INFO]:-Gathering data from segments...
.......
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:--Master Configuration & Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Master host = hdm2
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Master postgres process ID = 386183
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Master data directory = /data1/master/gpseg-1
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Master port = 5432
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Master current role = dispatch
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- HAWQ initsystem version = 1.2.0.1 build 8119
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- HAWQ current version = PostgreSQL 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 1.2.0.1 build 8119) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Apr 23 2014 16:12:32
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Postgres version = 8.2.15
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Master standby = hdm1
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Standby master state = Standby host passive
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-Segment Instance Status Report
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Segment Info
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Hostname = hdw1.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Address = hdw1.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg0
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Port = 40000
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- PID = 152210
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Configuration reports status as = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Database status = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Segment Info
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Hostname = hdw1.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Address = hdw1.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg1
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Port = 40001
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- PID = 152209
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Configuration reports status as = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Database status = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Segment Info
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Hostname = hdw2.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Address = hdw2.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg2
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Port = 40000
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- PID = 151852
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Configuration reports status as = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Database status = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Segment Info
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Hostname = hdw2.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Address = hdw2.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg3
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Port = 40001
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- PID = 151851
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Configuration reports status as = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Database status = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Segment Info
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Hostname = hdw3.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Address = hdw3.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg4
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Port = 40000
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- PID = 152179
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Configuration reports status as = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Database status = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Segment Info
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Hostname = hdw3.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Address = hdw3.hadoop.local
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg5
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Port = 40001
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Status
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- PID = 152180
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Configuration reports status as = Up
20140716:23:43:38:448564 gpstate:hdm2:gpadmin-[INFO]:- Database status = Up
2. Run the following two commands to generate scripts that delete the data directories on the local file system and on HDFS:
psql -Atc "select 'hdfs dfs -rm -r -f ' || fselocation from pg_filespace_entry where fselocation like 'hdfs://%'" template1 > rm-hdfs.sh
psql -Atc "select 'ssh ' || c.hostname || ' ''rm -rf '|| f.fselocation ||'''' from gp_segment_configuration c, pg_filespace_entry f where c.dbid=f.fsedbid and f.fselocation not like 'hdfs://%'" template1 > rm-local.sh
If there is no need to keep the HAWQ cluster data recoverable in the HDFS trash, use this command instead of the first one; the -skipTrash option deletes the data permanently instead of moving it to the trash directory.
psql -Atc "select 'hdfs dfs -rm -r -f -skipTrash ' || fselocation from pg_filespace_entry where fselocation like 'hdfs://%'" template1 > rm-hdfs.sh
An example is as shown below:
[gpadmin@hdm2 ~]$ psql -Atc "select 'hdfs dfs -rm -r -f ' || fselocation from pg_filespace_entry where fselocation like 'hdfs://%'" template1 > rm-hdfs.sh
[gpadmin@hdm2 ~]$ cat rm-hdfs.sh
hdfs dfs -rm -r -f hdfs://gphd201/hawq_data/gpseg0
hdfs dfs -rm -r -f hdfs://gphd201/hawq_data/gpseg2
hdfs dfs -rm -r -f hdfs://gphd201/hawq_data/gpseg4
hdfs dfs -rm -r -f hdfs://gphd201/hawq_data/gpseg1
hdfs dfs -rm -r -f hdfs://gphd201/hawq_data/gpseg3
hdfs dfs -rm -r -f hdfs://gphd201/hawq_data/gpseg5
[gpadmin@hdm2 ~]$ psql -Atc "select 'ssh ' || c.hostname || ' ''rm -rf '|| f.fselocation ||'''' from gp_segment_configuration c, pg_filespace_entry f where c.dbid=f.fsedbid and f.fselocation not like 'hdfs://%'" template1 > rm-local.sh
[gpadmin@hdm2 ~]$ cat rm-local.sh
ssh hdw1.hadoop.local 'rm -rf /data1/primary/gpseg0'
ssh hdw2.hadoop.local 'rm -rf /data1/primary/gpseg2'
ssh hdw3.hadoop.local 'rm -rf /data1/primary/gpseg4'
ssh hdw1.hadoop.local 'rm -rf /data1/primary/gpseg1'
ssh hdw2.hadoop.local 'rm -rf /data1/primary/gpseg3'
ssh hdw3.hadoop.local 'rm -rf /data1/primary/gpseg5'
ssh hdm2 'rm -rf /data1/master/gpseg-1'
ssh hdm2 'rm -rf /data1/master/dfs/gpseg-1'
ssh hdm1 'rm -rf /data1/master/gpseg-1'
ssh hdm1 'rm -rf /data1/master/dfs/gpseg-1'
3. Make the shell scripts just created executable:
[gpadmin@hdm2 ~]$ chmod +x rm-hdfs.sh rm-local.sh
[gpadmin@hdm2 ~]$ ls -l rm-hdfs.sh rm-local.sh
-rwxr-xr-x 1 gpadmin gpadmin 306 Jul 16 23:46 rm-hdfs.sh
-rwxr-xr-x 1 gpadmin gpadmin 486 Jul 16 23:48 rm-local.sh
4. Stop the HAWQ cluster:
[gpadmin@hdm2 ~]$ gpstop -a
20140716:23:55:18:448905 gpstop:hdm2:gpadmin-[INFO]:-Starting gpstop with args: -a
20140716:23:55:18:448905 gpstop:hdm2:gpadmin-[INFO]:-Gathering information and validating the environment...
20140716:23:55:19:448905 gpstop:hdm2:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20140716:23:55:19:448905 gpstop:hdm2:gpadmin-[INFO]:-Obtaining Segment details from master...
20140716:23:55:20:448905 gpstop:hdm2:gpadmin-[INFO]:-Greenplum Version: 'postgres (HAWQ) 4.2.0 build 1'
20140716:23:55:20:448905 gpstop:hdm2:gpadmin-[INFO]:-There are 0 connections to the database
20140716:23:55:20:448905 gpstop:hdm2:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart'
20140716:23:55:20:448905 gpstop:hdm2:gpadmin-[INFO]:-Master host=hdm2
20140716:23:55:20:448905 gpstop:hdm2:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=smart
20140716:23:55:20:448905 gpstop:hdm2:gpadmin-[INFO]:-Master segment instance directory=/data1/master/gpseg-1
20140716:23:55:21:448905 gpstop:hdm2:gpadmin-[INFO]:-Stopping gpsyncmaster on standby host hdm1 mode=fast
20140716:23:55:24:448905 gpstop:hdm2:gpadmin-[INFO]:-Successfully shutdown sync process on hdm1
20140716:23:55:24:448905 gpstop:hdm2:gpadmin-[INFO]:-Commencing parallel segment instance shutdown, please wait...
......
20140716:23:55:30:448905 gpstop:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:55:30:448905 gpstop:hdm2:gpadmin-[INFO]:- Segments stopped successfully = 6
20140716:23:55:30:448905 gpstop:hdm2:gpadmin-[INFO]:- Segments with errors during stop = 0
20140716:23:55:30:448905 gpstop:hdm2:gpadmin-[INFO]:-----------------------------------------------------
20140716:23:55:30:448905 gpstop:hdm2:gpadmin-[INFO]:-Successfully shutdown 6 of 6 segment instances
20140716:23:55:30:448905 gpstop:hdm2:gpadmin-[INFO]:-Database successfully shutdown with no errors reported
5. Run the scripts to delete the data directories on the local file system and on HDFS:
[gpadmin@hdm2 ~]$ ./rm-local.sh
[gpadmin@hdm2 ~]$ ./rm-hdfs.sh
Moved: 'hdfs://gphd201/hawq_data/gpseg0' to trash at: hdfs://gphd201/user/gpadmin/.Trash/Current
Moved: 'hdfs://gphd201/hawq_data/gpseg2' to trash at: hdfs://gphd201/user/gpadmin/.Trash/Current
Moved: 'hdfs://gphd201/hawq_data/gpseg4' to trash at: hdfs://gphd201/user/gpadmin/.Trash/Current
Moved: 'hdfs://gphd201/hawq_data/gpseg1' to trash at: hdfs://gphd201/user/gpadmin/.Trash/Current
Moved: 'hdfs://gphd201/hawq_data/gpseg3' to trash at: hdfs://gphd201/user/gpadmin/.Trash/Current
Moved: 'hdfs://gphd201/hawq_data/gpseg5' to trash at: hdfs://gphd201/user/gpadmin/.Trash/Current
If any error occurs when running the scripts, check the specific command that failed, determine the cause, and then run that command again manually.
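To isolate a failing command, one option is to run a generated script one line at a time so that execution stops at the first failure. The helper below is only a sketch, not part of the original procedure; the function name run_stepwise is an invented example.

```shell
# Hypothetical helper (not a HAWQ utility): execute a generated script one
# command at a time and stop at the first failure, so the failing command
# can be inspected and rerun by hand.
run_stepwise() {
  while IFS= read -r cmd; do
    [ -z "$cmd" ] && continue          # skip blank lines
    echo "running: $cmd"
    if ! eval "$cmd"; then
      echo "FAILED: $cmd" >&2          # report the first failing command
      return 1
    fi
  done < "$1"
}
```

For example, `run_stepwise rm-hdfs.sh` prints each command before running it and stops at the first one that exits non-zero, which is usually enough to identify the problem directory or host.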
6. Double-check that the data directories have been removed from the local file system and HDFS.
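One way to do this double-check is to derive verification commands from the deletion scripts generated in step 2. The helper below is a sketch that assumes rm-hdfs.sh and rm-local.sh are still present in the current directory; the function name make_check_script is invented for illustration.

```shell
# Sketch: rewrite a deletion script into a verification script.
#   local lines: "rm -rf DIR"          -> "ls -ld DIR"
#   HDFS lines:  "hdfs dfs -rm -r -f"  -> "hdfs dfs -ls"
# After step 5, every resulting command should fail with a
# "No such file or directory" style error if the directory is really gone.
make_check_script() {
  sed -e 's/rm -rf /ls -ld /' -e 's/-rm -r -f /-ls /' "$1"
}
```

For example, `make_check_script rm-local.sh | sh` runs an `ls -ld` over each deleted local directory via ssh, and `make_check_script rm-hdfs.sh | sh` does the same for the HDFS paths; errors from every command indicate the directories are gone.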
7. Re-initialize the HAWQ cluster with gpinitsystem if necessary.