gprecoverseg utilizes gp_primarymirror to check the primary and mirror segment states in order to determine whether the segments are "ready" for recovery.
Example:
20161005:10:42:00:062639 gprecoverseg:xxx:gpadmin-[DEBUG]:-[worker4] finished cmd: Get segment status cmdStr='ssh -o 'StrictHostKeyChecking no' ecdlnjqgrpdb01 ". /bb/gpdata/greenplum-db/./greenplum_path.sh; $GPHOME/bin/gp_primarymirror -h ecdlnjqgrpdb01 -p 40001"' had result: cmd had rc=1 completed=True halted=False stdout='' stderr='mode: PrimarySegment segmentState: Ready dataState: InSync faultType: NotInitialized mode: PrimarySegment segmentState: Ready dataState: InSync faultType: NotInitialized
gprecoverseg expects this status to be returned on stderr in a specific format, from which it parses the necessary values.
If gprecoverseg encounters additional output returned by SSH, it may assume that the segment is not "Ready" and retry the operation several times before finally terminating.
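The effect of extra SSH output on parsing can be sketched as follows. This is a minimal illustration, not the actual gppylib code: the function names `parse_segment_status` and `naive_mode` are hypothetical, and the field names are taken only from the sample stderr shown above. The naive variant is an assumption about how a parser that trusts the first line could end up treating an xauth message as the segment mode.

```python
import re

# Field names as they appear in the sample stderr above (assumed complete).
FIELDS = ("mode", "segmentState", "dataState", "faultType")


def parse_segment_status(stderr_text):
    """Return the status fields as a dict, or None if any field is missing."""
    found = dict(re.findall(r"(\w+):\s*(\S+)", stderr_text))
    if not all(name in found for name in FIELDS):
        return None
    return {name: found[name] for name in FIELDS}


def naive_mode(stderr_text):
    """Hypothetical naive parse: take whatever follows the first colon on the first line."""
    lines = stderr_text.splitlines()
    first = lines[0] if lines else ""
    return first.partition(":")[2].strip()


clean = ("mode: PrimarySegment segmentState: Ready "
         "dataState: InSync faultType: NotInitialized")
noisy = "/usr/bin/xauth:  error in locking authority file /home/gpadmin/.Xauthority"

print(parse_segment_status(clean))  # all four fields present
print(parse_segment_status(noisy))  # None: no status fields in the output
print(naive_mode(noisy))            # the xauth message is mistaken for the mode
```

A parser that validates the expected field names can reject unrelated SSH output outright, whereas one that trusts the position of the text will pick up whatever SSH happened to print.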
In this scenario, some X11 forwarding changes had been made to the gpadmin profile on several segments, causing various errors to be returned:
- expected response -
20161005:10:42:00:062639 gprecoverseg:xxx:gpadmin-[DEBUG]:-[worker4] finished cmd: Get segment status cmdStr='ssh -o 'StrictHostKeyChecking no' ecdlnjqgrpdb01 ". /bb/gpdata/greenplum-db/./greenplum_path.sh; $GPHOME/bin/gp_primarymirror -h ecdlnjqgrpdb01 -p 40001"' had result: cmd had rc=1 completed=True halted=False stdout='' stderr='mode: PrimarySegment ...

- error message 1 -
20161005:10:42:01:062639 gprecoverseg:xxx:gpadmin-[DEBUG]:-[worker7] finished cmd: Get segment status cmdStr='ssh -o 'StrictHostKeyChecking no' ecdlnjqgrpdb02 ". /bb/gpdata/greenplum-db/./greenplum_path.sh; $GPHOME/bin/gp_primarymirror -h ecdlnjqgrpdb02 -p 50003"' had result: cmd had rc=1 completed=True halted=False stdout='' stderr='/usr/bin/xauth: error in locking authority file /home/gpadmin/.Xauthority

- error message 2 -
20161005:10:42:01:062639 gprecoverseg:xxx:gpadmin-[DEBUG]:-[worker6] finished cmd: Get segment status cmdStr='ssh -o 'StrictHostKeyChecking no' ecdlnjqgrpdb01 ". /bb/gpdata/greenplum-db/./greenplum_path.sh; $GPHOME/bin/gp_primarymirror -h ecdlnjqgrpdb01 -p 40002"' had result: cmd had rc=1 completed=True halted=False stdout='' stderr='Warning: No xauth data; using fake authentication data for X11 forwarding.

- exception -
20161005:10:42:06:062639 gprecoverseg:ecdlnjqgrpms01:gpadmin-[ERROR]:-gprecoverseg failed. exiting...
Traceback (most recent call last):
  File "/bb/gpdata/greenplum-db/lib/python/gppylib/mainUtils.py", line 281, in simple_main_locked
    exitCode = commandObject.run()
  File "/bb/gpdata/greenplum-db/lib/python/gppylib/programs/clsRecoverSegment.py", line 1266, in run
    raise Exception("Inconsistency in catalog and segment Role/Mode. Catalog Role = %s. Segment Mode = %s." % (db.getSegmentRole(), mode))
Exception: Inconsistency in catalog and segment Role/Mode. Catalog Role = p. Segment Mode = error in locking authority file /home/gpadmin/.Xauthority.
To resolve this issue, disable X11 forwarding in the SSH client configuration, ~/.ssh/config:
Host *
    ForwardAgent no
    ForwardX11 no
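One way to confirm that the setting has taken effect is to inspect the client configuration that OpenSSH resolves for a segment host. The sketch below assumes an OpenSSH client recent enough to support `ssh -G` (which prints the effective options without connecting); the helper names are hypothetical and are not part of Greenplum.

```python
import subprocess


def parse_ssh_effective_config(ssh_g_output):
    """Turn `ssh -G` output (one 'keyword value' pair per line) into a dict."""
    options = {}
    for line in ssh_g_output.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            # Keep the first value seen for each keyword.
            options.setdefault(key.lower(), value.strip())
    return options


def forwarding_disabled(ssh_g_output):
    """True only if both X11 and agent forwarding are explicitly 'no'."""
    options = parse_ssh_effective_config(ssh_g_output)
    return options.get("forwardx11") == "no" and options.get("forwardagent") == "no"


def effective_config(host):
    """Run `ssh -G <host>` as the current user and return its stdout."""
    result = subprocess.run(["ssh", "-G", host],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Running `forwarding_disabled(effective_config(host))` as gpadmin for each segment host should return True once the configuration above is in place.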