The copy cluster task between our production and backup Vertica instances now fails after upgrading to version 10.1.1.20. It had been working reliably since version 9, and the script is unchanged. Now we get vague error messages and the copy cluster fails.
Every time I run the copy cluster, it reports that one of the 5 nodes (which node varies randomly) is missing a critical file. For example:
]# cat copy_cluster.log
stop the db
Database drdata stopped successfully
Starting copy of database drdata.
Participating nodes: v_drdata_node0001, v_drdata_node0002, v_drdata_node0003, v_drdata_node0004, v_drdata_node0005.
Snapshotting database.
Snapshot complete.
sync the db
Error: Missing critical file: [XX.XX.XX.XXX]:/opt/catalog/drdata/v_drdata_node0005_catalog/Snapshots/Copy_drdata.txt
Copycluster FAILED.
In this case it says node 5 is missing the file, yet when I go to that node the file clearly does exist:
# cd /opt/catalog/drdata/v_drdata_node0005_catalog/Snapshots/
# ll
total 649780
-rw------- 1 dradmin verticadba 642861002 Mar 20 09:00 Copy_drdata.ctlg
-rw------- 1 dradmin verticadba 11996782 Mar 20 09:00 Copy_drdata.files
-rw------- 1 dradmin verticadba 10500768 Mar 20 09:00 Copy_drdata.manifest
-rw------- 1 dradmin verticadba 5284 Mar 20 09:00 Copy_drdata.txt
-rw------- 1 dradmin verticadba 0 Mar 20 09:00 Copy_drdata.udfs
I have run this multiple times and the error is always the same, except it names a different node each time (node 5, then node 3, node 4, etc.).
Environment: Dx NetOps Performance Management 22.2
Cause: a defect in the Vertica vbr.py script.
The fix was to change one line in the /opt/vertica/bin/vbr.py script.
From:
session_host, db_paths[next(iter(self._participating_nodes))], snap_name)
To this:
session_host, db_paths[init_node], snap_name)
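The one-line fix makes sense of the symptom: `self._participating_nodes` is a set, and `next(iter(...))` on a set returns an arbitrary element, so the snapshot path was built against a different, effectively random node on each run, while `init_node` pins it to the known initiator. A minimal sketch of the difference, using hypothetical node names and paths (not the actual vbr.py internals):

```python
# Sketch only: simplified stand-ins for vbr.py's participating-node set and
# per-node catalog paths. Names below are assumptions for illustration.
participating_nodes = {
    "v_drdata_node0001", "v_drdata_node0002", "v_drdata_node0003",
    "v_drdata_node0004", "v_drdata_node0005",
}
db_paths = {n: f"/opt/catalog/drdata/{n}_catalog" for n in participating_nodes}

# Buggy behavior: a Python set is unordered, so this picks an arbitrary
# node -- matching the error naming a different node on each run.
arbitrary_node = next(iter(participating_nodes))
buggy_path = db_paths[arbitrary_node]

# Fixed behavior: always resolve the path against the initiator node
# (init_node in vbr.py), which is deterministic.
init_node = "v_drdata_node0001"  # assumption: the node running the backup
fixed_path = db_paths[init_node]
```

The fixed lookup always yields the same path for the initiator node, so the snapshot validation checks the node that actually holds the files.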