How to replicate table from a database or cluster to a different database or cluster
search cancel

How to replicate table from a database or cluster to a different database or cluster

book

Article ID: 295796

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

This article explains provides a method to move a table from clusters or database to a different database or cluster without the need for a backup and a restore. It very useful if developers want a replica of the original tables for their development and testing on a different database.


Note: Please verify the script on a test cluster before running it on production.


Environment


Resolution

Use this script that is needed to replicate the table quickly to another database:

pg_dump -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>


Note: There is another tool "gpbackup" that can be used for data transfer (gpbackup is included in v4.3.26.0 and later releases).


Example:

1. Create a database and table. Generate a source database.

gpadmin=# create database d1;
CREATE DATABASE
Time: 8855.715 ms

gpadmin=# \c d1
You are now connected to database "d1" as user "gpadmin".

2. Creating a table to replicate data and load data into it.

d1=# create table t1 as select * from pg_attribute;
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 'attrelid' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
SELECT 2998
Time: 1705.794 ms

3. Create the destination database:

d1=# create database d2;
CREATE DATABASE
Time: 10753.638 ms
d1=# \q

4. Run the command to backup and load data simultaneously.

gpadmin:Fullrack@mdw $ pg_dump -C -t t1 d1 | psql d2
Timing is on.
Pager usage is off.
SET
Time: 184.239 ms
SET
Time: 0.740 ms
SET
Time: 0.677 ms
SET
Time: 0.652 ms
SET
Time: 0.598 ms
SET
Time: 0.635 ms
SET
Time: 1.565 ms
SET
Time: 0.640 ms
SET
Time: 0.601 ms
CREATE TABLE
Time: 66.210 ms
ALTER TABLE
Time: 18.538 ms
Time: 1247.149 ms

Note: This data loaded to a different database on the same cluster. So, many parameters that are mentioned above are removed as they are not needed here. To replicate them on a different cluster ensure you have the pg_hba.conf is updated in the source location with the correct information. This will allow the the psql to connect to the source host and database without any hassle.

5. Cross verification shows everything is worked successfully and the table is loaded on the destination database.

gpadmin:Fullrack@mdw $ psql d2
 Timing is on.
 Pager usage is off.
 psql (8.2.15)
 Type "help" for help.

d2=# \d
              List of relations
  Schema | Name | Type  |  Owner  | Storage
 --------+------+-------+---------+---------
  public | t1   | table | gpadmin | heap
 (1 row)

d2=# select count(*) from t1;
  count
 -------
   2998
 (1 row) 
Time: 82.246 ms

Tip:

If the source table has many grant permission then the pg_dump will run them on the destination, to eliminate the grant (permission part) use the command below:

pg_dump -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | egrep -v "GRANT|REVOKE" | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>

To load just the data use pg_dump with -a option.

pg_dump -a -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>

For more options refer to the help from pg_dump:

pg_dump --help



Additional Information

If the data set is very large, check gptransfer utility, here is an example on how to use it.
 

It is possible to use psql to do the same task. Make sure the DDL of the table is available on the destination.

psql -c "copy (select * from <source-table>) to stdout" <source-database> | psql -c "copy <destination-table> from stdin" <destination database>

Related articles