Unable to recover failed segments

Article ID: 416974

Updated On:

Products

VMware Tanzu Data Suite

Issue/Introduction

After a gprecoverseg rebalance is attempted, the system goes down due to a double fault. The gprecoverseg output shows the mirror promotion timing out:

20251006:00:08:33:2181814 gprecoverseg:cdw:gpadmin-[INFO]:-Stopping unbalanced primary segments...
20251006:00:08:43:2181814 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering segment reconfiguration
20251006:00:21:00:2181814 gprecoverseg:cdw:gpadmin-[CRITICAL]:-gprecoverseg failed. (Reason='Mirror promotion did not complete in 600 seconds.') exiting...

Resolution

If both the primary and the mirror are marked down and the system is down because of the double fault, a manual change to the gp_segment_configuration table may be required. Please submit a case with VMware Tanzu Support.
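To confirm which segment pairs are affected before opening the case, the rows of gp_segment_configuration can be grouped by content ID. The following is a minimal sketch, assuming the rows have already been fetched into dictionaries keyed by the catalog column names (content, role, status); the function name and the sample rows are hypothetical and are not part of any Greenplum utility.

```python
from collections import defaultdict


def double_faulted_contents(rows):
    """Return content IDs for which no segment instance has status 'u' (up)."""
    by_content = defaultdict(list)
    for row in rows:
        by_content[row["content"]].append(row)
    return sorted(
        content
        for content, segments in by_content.items()
        # content -1 is the coordinator, which is not a primary/mirror segment pair
        if content >= 0 and all(seg["status"] != "u" for seg in segments)
    )


# Hypothetical sample rows, not taken from a real cluster.
sample_rows = [
    {"content": -1, "role": "p", "status": "u"},
    {"content": 0, "role": "p", "status": "d"},
    {"content": 0, "role": "m", "status": "d"},
    {"content": 1, "role": "p", "status": "u"},
    {"content": 1, "role": "m", "status": "d"},
]

print(double_faulted_contents(sample_rows))
```

This only reports the affected content IDs. It does not modify the catalog; any change to gp_segment_configuration should be made with Support.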

In GPDB version 7.5.2, a rebalance checks the following conditions to determine whether an unbalanced segment pair should be rebalanced:

The segment pair must be up.
The segment pair must be reachable.
The segment pair must be synchronized.

An additional check should be added to verify the mirror's reachability on its configured address.
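The checks above, together with the proposed additional check, can be sketched as follows. This is an illustration under stated assumptions, not the gprecoverseg implementation: the function names are hypothetical, the TCP probe is a stand-in for whatever reachability test the utility performs, and it is assumed that the existing reachability check uses the hostname column while the additional check uses the mirror's address column.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (illustrative probe only)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_rebalance(primary, mirror, probe=is_reachable):
    """Return True if an unbalanced pair passes every check, including the proposed one.

    primary and mirror are dicts keyed by gp_segment_configuration column names,
    describing the segments currently acting as primary and as mirror.
    """
    pair = (primary, mirror)
    # Unbalanced: at least one segment is not running in its preferred role.
    if not any(seg["role"] != seg["preferred_role"] for seg in pair):
        return False
    # The segment pair must be up.
    if not all(seg["status"] == "u" for seg in pair):
        return False
    # The segment pair must be synchronized.
    if not all(seg["mode"] == "s" for seg in pair):
        return False
    # The segment pair must be reachable (assumption: probed by hostname).
    if not all(probe(seg["hostname"], seg["port"]) for seg in pair):
        return False
    # Proposed additional check: the mirror answers on its configured address.
    return probe(mirror["address"], mirror["port"])


# Hypothetical pair whose mirror is reachable by hostname but not by address.
acting_primary = {"role": "p", "preferred_role": "m", "status": "u", "mode": "s",
                  "hostname": "sdw2", "address": "sdw2-1", "port": 7000}
acting_mirror = {"role": "m", "preferred_role": "p", "status": "u", "mode": "s",
                 "hostname": "sdw1", "address": "sdw1-1", "port": 6000}


def fake_probe(host, port):
    """Test double: every host answers except the mirror's configured address."""
    return host != "sdw1-1"


print(should_rebalance(acting_primary, acting_mirror, probe=fake_probe))
```

With the first three checks alone this pair would be accepted for rebalance; the address probe is what rejects it in the sketch.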

R&D are currently working on a fix for this issue.
