Unable to recover failed segments

Article ID: 416974

Updated On:

Products

VMware Tanzu Data Suite

Issue/Introduction

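After a gprecoverseg rebalance is attempted, the system goes down due to a double fault (both the primary and the mirror of a segment pair are marked down). The gprecoverseg output shows the mirror promotion timing out: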

20251006:00:08:33:2181814 gprecoverseg:cdw:gpadmin-[INFO]:-Stopping unbalanced primary segments...
20251006:00:08:43:2181814 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering segment reconfiguration
20251006:00:21:00:2181814 gprecoverseg:cdw:gpadmin-[CRITICAL]:-gprecoverseg failed. (Reason='Mirror promotion did not complete in 600 seconds.') exiting... 

Resolution

If both the primary and the mirror are marked down and the system is down because of this double fault, a manual change to the gp_segment_configuration table may be required. Please submit a case with VMware Tanzu Support.
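
Before opening the case, it can help to confirm which segment pairs are affected. The following is a minimal, read-only sketch, assuming the rows of gp_segment_configuration have already been fetched (for example with a SELECT of the content, role and status columns) into a list of dictionaries. The function name and the sample rows are hypothetical; the sketch only reports content IDs and does not modify the catalog.

```python
from collections import defaultdict


def find_double_faults(rows):
    """Return the sorted content IDs for which every segment is marked down.

    `rows` is assumed to be an iterable of dicts carrying at least the
    `content` and `status` columns of gp_segment_configuration.
    """
    statuses_by_content = defaultdict(list)
    for row in rows:
        statuses_by_content[row["content"]].append(row["status"])

    return sorted(
        content
        for content, statuses in statuses_by_content.items()
        # content -1 is the coordinator, not a primary/mirror segment pair
        if content >= 0 and all(status == "d" for status in statuses)
    )


# Hypothetical sample data: content 1 has both primary and mirror down.
sample_rows = [
    {"content": -1, "role": "p", "status": "u"},
    {"content": 0, "role": "p", "status": "u"},
    {"content": 0, "role": "m", "status": "u"},
    {"content": 1, "role": "p", "status": "d"},
    {"content": 1, "role": "m", "status": "d"},
]

print(find_double_faults(sample_rows))  # prints [1]
```

Including the affected content IDs in the support case may shorten the investigation.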

In GPDB version 7.5.2, a rebalance currently checks the following conditions to determine whether an unbalanced segment pair should be rebalanced:

  • The segment pair must be up.
  • The segment pair must be reachable.
  • The segment pair must be synchronized.
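
The three conditions above can be sketched as a simple eligibility test. This is an illustration only and not the gprecoverseg implementation: the Segment class, the reachable flag and the function name are hypothetical, while the status ('u'/'d') and mode ('s'/'n') values follow the gp_segment_configuration columns.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    status: str      # 'u' = up, 'd' = down
    mode: str        # 's' = synchronized, 'n' = not synchronized
    reachable: bool  # hypothetical result of a host reachability probe


def rebalance_blockers(primary, mirror):
    """Return the reasons why a segment pair should not be rebalanced.

    An empty list means all three conditions are satisfied.
    """
    pair = (primary, mirror)
    blockers = []
    if not all(seg.status == "u" for seg in pair):
        blockers.append("pair is not up")
    if not all(seg.reachable for seg in pair):
        blockers.append("pair is not reachable")
    if not all(seg.mode == "s" for seg in pair):
        blockers.append("pair is not synchronized")
    return blockers


healthy = Segment(status="u", mode="s", reachable=True)
lagging = Segment(status="u", mode="n", reachable=True)

print(rebalance_blockers(healthy, healthy))  # prints []
print(rebalance_blockers(healthy, lagging))  # prints ['pair is not synchronized']
```

Returning the list of reasons, rather than a single boolean, makes it easier to report why a given pair was skipped.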

An additional check should be added to verify the mirror's reachability on its configured address.
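
As a rough illustration of such a check, the sketch below attempts a TCP connection to a segment's address and port, as recorded in the address and port columns of gp_segment_configuration. Treating a successful TCP connection as "reachable" is an assumption made for this example, and the function name is hypothetical. The fix being developed may verify reachability differently.

```python
import socket


def address_reachable(address, port, timeout=5.0):
    """Return True if a TCP connection to (address, port) succeeds.

    Name resolution failures, refused connections and timeouts are all
    subclasses of OSError and are reported as "not reachable".
    """
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage with a mirror's configured address and port:
#   address_reachable("sdw2-1", 7000)
```

Probing the address column rather than the hostname column matters when the two differ, for example when segments communicate over a dedicated network interface.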

R&D are currently working on a fix for this issue.