Unable to bring up failed segment with error: "PROBE received dbid: doesn't match this segments configured dbid"

Article ID: 296355

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

When trying to bring up a failed original primary segment with gprecoverseg, the pg log of the original mirror instance (now acting as primary) may report:
PROBE received dbid:xx doesn't match this segments configured dbid xx

On the master host, either gpstate or the gp_segment_configuration table then shows the current mirror as down, even though the mirror instance is up and running on its own host. The segment's pg log shows only that the instance is in recovery mode, with no other errors. After waiting a while, the latest xlog file on the mirror instance can be seen to keep updating (judging by its timestamp and by pg_xlogdump output).

This happens because the dbid configured on the original mirror (current primary) instance does not match the dbid recorded for it on the master node (in the gp_segment_configuration table).

Check the dbid setting from the current primary:

1. cat $segment_data_directory/internal.auto.conf
2. Connect to the current primary instance in utility mode, then run: select * from pg_settings where name='gp_dbid';
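
The file check in step 1 can be scripted so that only the dbid value is printed. The following is a minimal sketch, assuming the file holds a line of the form gp_dbid=<number>; the function name read_gp_dbid is a hypothetical helper, not a Greenplum utility.

```shell
# Hypothetical helper (not part of Greenplum): print the gp_dbid value
# stored in an internal.auto.conf file.
read_gp_dbid() {
    # Match "gp_dbid=<number>", allowing optional spaces around "=".
    # If the setting appears more than once, report the last occurrence.
    sed -n 's/^[[:space:]]*gp_dbid[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Example call (the data directory variable is a placeholder):
# read_gp_dbid "$segment_data_directory/internal.auto.conf"
```

The printed value can then be compared with the dbid column for the same instance in gp_segment_configuration on the master.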

This issue may affect all original mirror instances.

Environment

Product Version: 6.7

Resolution

If only one instance is affected, then:

1. On the master host, find the correct dbid for this original mirror.
2. Shut down the database.
3. Manually edit the $segment_data_directory/internal.auto.conf file, changing only the dbid setting.
4. Start the database.
5. Run the recovery again.
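
The manual edit in step 3 could be done with a small helper that keeps a backup and touches only the gp_dbid line. This is a sketch under the assumption that the setting is written as gp_dbid=<number>; set_gp_dbid and the .bak suffix are hypothetical choices for illustration, and the database should be shut down before it is run.

```shell
# Hypothetical helper (not part of Greenplum): rewrite only the gp_dbid
# line of an internal.auto.conf file, keeping a backup copy beside it.
set_gp_dbid() {
    _conf="$1"
    _new="$2"
    # Refuse anything that is not a plain number.
    case "$_new" in
        ''|*[!0-9]*) echo "dbid must be numeric" >&2; return 1 ;;
    esac
    # Keep the original file as <file>.bak before changing it.
    cp -p "$_conf" "$_conf.bak" || return 1
    # Replace the gp_dbid line; every other line is copied unchanged.
    sed "s/^[[:space:]]*gp_dbid[[:space:]]*=.*/gp_dbid=$_new/" "$_conf.bak" > "$_conf"
}

# Example call (path and dbid are placeholders):
# set_gp_dbid "$segment_data_directory/internal.auto.conf" 5
```

Writing from the backup into the original path avoids relying on in-place editing options that differ between sed implementations.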

If multiple instances are affected, then:

1. Generate a fix script from the master host (tuples-only mode keeps the column header and row count out of the script):
psql
\t on
\o /tmp/fix.out
select 'ssh '||hostname||' ''echo gp_dbid='||dbid||' > '||datadir ||'/internal.auto.conf''' from gp_segment_configuration where role='m';
\o
2. Shut down the database.
3. Run the fix script.
4. Start the database.
5. Double-check the content of each "internal.auto.conf" file against the gp_segment_configuration output.
6. Run recovery if required.
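
The double-check in step 5 can be sketched as a small comparison run on each segment host. The helper below is hypothetical (check_dbids is not a Greenplum utility): it reads "expected_dbid path" pairs on standard input, which are assumed to be taken from the dbid and datadir columns of gp_segment_configuration for that host, and it reports whether each file holds the expected value.

```shell
# Hypothetical helper (not part of Greenplum): read lines of the form
#   <expected_dbid> <path to internal.auto.conf>
# from standard input and report whether each file holds that dbid.
# Returns non-zero if any file is missing or does not match.
check_dbids() {
    _status=0
    while read -r _expected _conf; do
        # Skip blank lines.
        [ -n "$_expected" ] || continue
        # Extract the last gp_dbid value in the file (empty if none).
        _actual=$(sed -n 's/^[[:space:]]*gp_dbid[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$_conf" 2>/dev/null | tail -n 1)
        if [ "$_actual" = "$_expected" ]; then
            echo "OK $_conf gp_dbid=$_actual"
        else
            echo "MISMATCH $_conf expected=$_expected actual=${_actual:-none}"
            _status=1
        fi
    done
    return $_status
}

# Example call (dbid and path are placeholders):
# echo "5 $segment_data_directory/internal.auto.conf" | check_dbids
```

Any line reported as MISMATCH should be corrected before recovery is run.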