gpcopy hangs when copying a table with a foreign key (FK)
search cancel

gpcopy hangs when copying a table with a foreign key (FK)

book

Article ID: 382835

calendar_today

Updated On:

Products

VMware Tanzu Data Suite Pivotal Data Suite Non Production Edition VMware Tanzu Data Suite Greenplum VMware Tanzu Greenplum

Issue/Introduction

gpcopy with --verbose option shows the same command continually being re-run:

   :
   :
20241122:11:00:07 gpcopy:gpadmin:source-host:142839-[DEBUG]:-[Worker 0] On "dest-host". Connection execute query:
CREATE TEMP TABLE gpcopy_temp_674 (segid int, content text) DISTRIBUTED RANDOMLY
20241122:11:00:08 gpcopy:gpadmin:source-host:142839-[DEBUG]:-[Worker 0] On "dest-host". Connection execute query:
COPY gpcopy_temp_674 FROM PROGRAM 'tmpout=$(gpcopy_helper info --seg-id -1 gpcopy_202411201020_URw7HGprvsVaRiUjh7emmg) && echo "${tmpout}" | sed -e "s/^/-1|/g"' WITH DELIMITER '|'
20241122:11:00:08 gpcopy:gpadmin:source-host:142839-[DEBUG]:-[Worker 0] On "dest-host". Connection select query:
SELECT segid, content FROM gpcopy_temp_674 ORDER BY segid
20241122:11:00:08 gpcopy:gpadmin:source-host:142839-[DEBUG]:-[Worker 0] On "dest-host". Connection execute query:
CREATE TEMP TABLE gpcopy_temp_675 (segid int, content text) DISTRIBUTED RANDOMLY
20241122:11:00:08 gpcopy:gpadmin:source-host:142839-[DEBUG]:-[Worker 0] On "dest-host". Connection execute query:
COPY gpcopy_temp_675 FROM PROGRAM 'tmpout=$(gpcopy_helper info --seg-id <SEGID> gpcopy_202411201020_URw7HGprvsVaRiUjh7emmg) && echo "${tmpout}" | sed -e "s/^/<SEGID>|/g"' WITH DELIMITER '|' ON SEGMENT
20241122:11:00:08 gpcopy:gpadmin:source-host:142839-[DEBUG]:-[Worker 0] On "dest-host". Connection select query:
SELECT segid, content FROM gpcopy_temp_675 ORDER BY segid
20241122:11:00:08 gpcopy:gpadmin:source-host:142839-[DEBUG]:-Transferred data 6TB, transfer rate 6.7TB/h

gpcopy without the --verbose  option just hangs after a successful copy (and possible analyse, if the the option is specified):

   :
   :
20241120:10:45:27 gpcopy:gpadmin:source-host:142839-[INFO]:-[Worker 4] Analyzing table "dbname"."schema01"."table02"
INSERT INTO "schema02"."table01" SELECT * FROM gpcopy_ext_327d7b2e16c54515a30af83e8fe99f8f 
20241120:10:45:27 gpcopy:gpadmin:source-host:142839-[INFO]:-[Worker 0] Analyzing table "dbname"."schema01"."table03"
20241120:10:45:27 gpcopy:gpadmin:source-host:142839-[INFO]:-[Worker 1] Analyzing table "dbname"."schema01"."table04"
DROP EXTERNAL TABLE gpcopy_ext_327d7b2e16c54515a30af83e8fe99f8f
20241120:10:45:32 gpcopy:gpadmin:source-host:142839-[INFO]:-[Worker 2] [Progress: (0/1) DBs, (2972/3442) tables done] Finished copying table "dbname"."schema02"."table01" => "iga_pct
d"."schema02"."table01"
ANALYZE "schema02"."table01"
20241120:10:45:33 gpcopy:gpadmin:source-host:142839-[INFO]:-[Worker 4] Analyzing table "dbname"."schema02"."table01"
20241120:10:45:41 gpcopy:gpadmin:source-host:142839-[INFO]:-[Worker 3] Analyzing table "dbname"."schema03"."table05"

 

Cause

gpcopy will hang when trying to copy a table with a foreign key due to a code bug in the gpcopy.

To check if tables in the source database have foreign keys, run:

SELECT nspname,conrelid::regclass AS table_name 
FROM pg_constraint c
JOIN pg_namespace n ON c.connamespace=n.oid
WHERE contype='f';

Resolution

Fix

As of Novenber 2024, a fix is in development

Workaround 1

  • Generate commands for dropping and adding the constraints. A "WHERE" clause can be used to specify particular namespace names
\o ~/add_constraints.sql
SELECT 'ALTER TABLE ' || c.conrelid::regclass || ' ADD CONSTRAINT ' || c.conname || ' ' || pg_get_constraintdef(c.oid) || ';' AS add
FROM pg_constraint c
JOIN pg_namespace n ON c.connamespace=n.oid
WHERE contype='f' 
-- AND nspname in ('schema1', 'schema2', ...) -- Add this to specify particular schemas, if needed
;

\o ~/drop_constraints.sql
SELECT 'ALTER TABLE ' || c.conrelid::regclass || ' DROP CONSTRAINT ' || c.conname || ';' AS drop
FROM pg_constraint c
JOINpg_namespace n ON c.connamespace=n.oid
WHERE contype='f' 
-- AND nspname in ('schema1', 'schema2', ...) -- Add this to specify particular schemas, if needed
;
  • ON THE SOURCE CLUSTER, run the DROP CONSTRAINT commands from the ~/drop_constraints.sql file
    You can validate that it worked by running the DROP CONSTRAINT query (the one below \o ~/drop_constraints.sql) again and it should return 0 rows
  • Run gpcopy (also add --debug) and it should work fine. It may be best to first try with the --dry-run flag to test it.
  • ON THE SOURCE AND DESTINATION CLUSTERS, run the ADD CONSTRAINT commands from the ~/add_constraints.sql file

Workaround 2

Exclude the tables with a foreign key(FK) from the gpcopy with the --exclude-table option and use gpbackup/gprestore or pg_dump to copy the data between the clusters