Replication of a DD Boost backup to a remote site, either through gpcrondump (for example, gpcrondump run with the options "--replicate --max-streams 10") or through gpmfr, fails with an error message such as:
20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_9_20160220145143 on ddboost failed. Err 5056
The libDDBoost.log file (located in the directory from which the command was run) shows the following error, indicating that the file being replicated was already in use by another process:
02/20 15:55:44: [6A770:EBE740] ddp_open_file() failed for File: GPDB/gp_production2/20160220/gp_dump_0_4_20160220145143, Err: 5056-GPDB/gp_production2/20160220/gp_dump_0_4_20160220145143 is in use on host datadbuf002
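To see quickly which host is holding the files, the "in use on host" entries in libDDBoost.log can be summarized. The following is a minimal sketch, assuming the log lines have the same layout as the example above; the function name hosts_holding_files and the embedded sample line are illustrative only and are not part of any Greenplum or DD Boost tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Greenplum): lists the hosts named in
# "Err: 5056 ... is in use on host <name>" entries read from stdin.
hosts_holding_files() {
    grep 'Err: 5056' \
        | sed -n 's/.* is in use on host \([^ ]*\).*/\1/p' \
        | sort -u
}

# Sample line copied from the libDDBoost.log excerpt above.
sample='02/20 15:55:44: [6A770:EBE740] ddp_open_file() failed for File: GPDB/gp_production2/20160220/gp_dump_0_4_20160220145143, Err: 5056-GPDB/gp_production2/20160220/gp_dump_0_4_20160220145143 is in use on host datadbuf002'

printf '%s\n' "$sample" | hosts_holding_files
```

Against a real log, the function would be fed the file instead, for example: hosts_holding_files < libDDBoost.log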
The cause is an incorrect DD Boost configuration on the client side: the entries for the local Data Domain backup server and the remote Data Domain server (DR site) were not set up correctly. The problem can be identified from the gpcrondump output itself:
20160220:14:55:44:017572 gpcrondump:smdw:gpadmin-[WARNING]:-Unable to send dump email notification
20160220:14:55:44:017572 gpcrondump:smdw:gpadmin-[INFO]:-To enable email notification, create /usr/local/GP-4.3.6.1/bin/mail_contacts or /home/gpadmin/mail_contacts containing required email addresses
20160220:14:55:44:017572 gpcrondump:smdw:gpadmin-[INFO]:-Backup to local Data Domain successful.
20160220:14:55:44:017572 gpcrondump:smdw:gpadmin-[INFO]:-Replicating test to remote Data Domain. (gpmfr.py --replicate 20160229152822 --max-streams 10 --master-port 5432)
Identifying backup files on local(dd_1.dca) Data Domain.
Initiating transfer for 73 files from local(dd_1.dca) to remote(dd_1.dca) Data Domain.
Backup 2016-February-20 14:51:43 (20160220145143) already present on remote(dd_1.dca) Data Domain.
Using at the most 10 I/O streams on each Data Domain.
[ ] 0%
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_9_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_9_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw: -[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_6_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_6_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_2_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_2_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_5_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_5_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_28_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_28_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_10_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_10_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_4_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_4_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_27_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_27_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_7_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_7_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Error in gpddboost: 20160220:14:55:44|gpddboost-[ERROR]:-Opening file gp_production2/20160220/gp_dump_0_30_20160220145143 on ddboost failed. Err 5056
File: gp_dump_0_30_20160220145143, current state: 1, new state: 5
20160220:14:55:44:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-One or more file transfers failed. Aborting all file transfers.
All active file transfers terminated.
Deleting files from remote(datadbuf002) Data Domain: 0 out of 10 files deleted ...
20160220:14:55:51:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Delete failed for backup file gp_production2/20160220/gp_dump_0_6_20160220145143 on Data Domain remote(datadbuf002). Reason: DD Boost error 5004: Backup object not found on DD system.
20160220:14:55:51:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-Failed to delete backup file: gp_dump_0_6_20160220145143 600 731248.
20160220:14:55:51:432945 gpcrondump:greenplum-mdw:gpadmin-[ERROR]:-%d format: a number is required, not str
[.......]
As the output above shows, the local Data Domain server and the remote Data Domain server are the same system: the transfer is reported as going from local(dd_1.dca) to remote(dd_1.dca). Replication therefore tries to open each backup file on the same host that already has it open, cannot obtain a lock on the file, and fails with Err 5056.
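The comparison described above can be automated by reading the "Initiating transfer" line of the gpmfr output. This is a sketch only, assuming the line keeps the local(...) / remote(...) layout shown in the log; the function name same_dd_host is hypothetical and the sample line is taken from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Greenplum): reads gpmfr/gpcrondump output
# on stdin, prints "<local> <remote>", and returns 0 only when both names match.
same_dd_host() {
    local line local_dd remote_dd
    # Take the first "Initiating transfer" line; return 2 if there is none.
    line=$(grep -m1 'Initiating transfer for') || return 2
    local_dd=$(printf '%s\n' "$line" | sed -n 's/.*from local(\([^)]*\)).*/\1/p')
    remote_dd=$(printf '%s\n' "$line" | sed -n 's/.*to remote(\([^)]*\)).*/\1/p')
    printf '%s %s\n' "$local_dd" "$remote_dd"
    [ -n "$local_dd" ] && [ "$local_dd" = "$remote_dd" ]
}

# Sample line copied from the gpcrondump output above.
sample='Initiating transfer for 73 files from local(dd_1.dca) to remote(dd_1.dca) Data Domain.'

if printf '%s\n' "$sample" | same_dd_host >/dev/null; then
    echo "WARNING: local and remote Data Domain are the same host"
fi
```

A match means the remote entry in the DD Boost configuration points back at the local Data Domain server, which is the condition corrected by the steps that follow.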
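On the server from which the backup was run, reconfigure DD Boost: remove the existing configuration, then set up the local Data Domain server, and finally set up the remote (DR) Data Domain server with --ddboost-remote: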
gpcrondump --ddboost-config-remove
gpcrondump --ddboost-host <backup-server> --ddboost-user <backup-ddboost-user> --ddboost-backupdir=<backup-directory>
gpcrondump --ddboost-host <DR-backup-server> --ddboost-user <backup-ddboost-user> --ddboost-backupdir=<backup-directory> --ddboost-remote
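The steps above can be wrapped in a small dry-run helper that refuses to proceed when the local and DR hosts are identical, which is the misconfiguration behind this error. This is a sketch, not a supported tool: the function name print_ddboost_reconfig and the host, user, and directory values are placeholders, and the function only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the three gpcrondump commands listed
# above after checking that the local and DR Data Domain hosts differ.
print_ddboost_reconfig() {
    local primary=$1 remote=$2 user=$3 dir=$4
    if [ "$primary" = "$remote" ]; then
        echo "refusing: local and DR Data Domain hosts are identical ($primary)" >&2
        return 1
    fi
    echo "gpcrondump --ddboost-config-remove"
    echo "gpcrondump --ddboost-host $primary --ddboost-user $user --ddboost-backupdir=$dir"
    echo "gpcrondump --ddboost-host $remote --ddboost-user $user --ddboost-backupdir=$dir --ddboost-remote"
}

# Placeholder values for illustration only.
print_ddboost_reconfig dd-local.example.com dd-dr.example.com ddboost_user gp_production2
```

Review the printed commands before running them by hand on the server from which the backup is executed.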