Queries fail with AO checksum errors on Pivotal Greenplum 4.x and 5.x

Article ID: 295381

Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:
Queries fail with an error similar to the one shown below:
ERROR:  Header checksum does not match.  Expected 0x14C682B and found 0x97777892 (cdbappendonlystorageread.c:856)  (seg0 slice1 sdw1:40000 pid=128842) (cdbdisp.c:1322)
DETAIL:  
Append-Only storage header kind 0 unknown
Scan of Append-Only Column-Oriented relation 'foo', column #10 'value'. Append-Only segment file 'base/17150/37815.1152', block header offset in file = 0, bufferCount 1
ERROR:  Bad append-only storage header.  Header check error 3, detail 'Append-only storage header is invalid -- invalid value 0 (none) for header kind (header_bytes_0_3 0x00190934, header_bytes_4_7 0x844ba000)' (cdbappendonlystorageread.c:873)  (seg0 slice1 sdw1:40000 pid=128858) (cdbdisp.c:1322)
DETAIL:  
Append-Only storage header kind 0 unknown
Scan of Append-Only Column-Oriented relation 'foo', column #10 'value'. Append-Only segment file 'base/17150/37815.1152', block header offset in file = 0, bufferCount 1
The error message provides the following information, which will be helpful during recovery:
  • Segment content=seg0
  • Segment host=sdw1
  • Segment port=40000
  • Database OID=17150
  • Tablename='foo'
  • Affected column name='value'
  • Table datafile.column=37815.1152, i.e. the datafile (relfilenode) to refer to here is 37815

Environment


Cause

Hardware errors can cause these kinds of errors.

Resolution

The resolution is divided into two steps: identification and recovery.

Identify the relation: 

  • Log in to the affected segment in utility mode. In this case, it is sdw1.
PGOPTIONS='-c gp_session_role=utility' psql -p 40000 -h sdw1 gpadmin
  • Using the relfilenode from the error, look up the relation in pg_class:
SELECT p.oid,n.nspname||'.'||p.relname as relation ,p.relfilenode FROM pg_class p , pg_namespace n where p.relnamespace=n.oid and p.relfilenode= <--Relfilenode--from--error--> ; 
Example: 
gpadmin=# SELECT p.oid,n.nspname||'.'||p.relname as relation ,p.relfilenode FROM pg_class p , pg_namespace n where p.relnamespace=n.oid and p.relfilenode=37815 ; 
  oid  |   relation    | relfilenode 
-------+---------------+-------------
 37815 | public.foo    |       37815
(1 row)

Recover the affected table 

The following are the options to resolve this error:

  • Truncate and reload: If the data can be reloaded, truncate the table to clean out the impacted datafiles and then restore the table data (see the sketch after this list).
  • Recover from the mirror: If the data cannot be reloaded, check whether the mirror has a valid copy of the data.
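
The following is a minimal sketch of the truncate-and-reload option. It assumes the affected table is public.foo (from the identification step), that <dbname> is the affected database, and that the reload source is a CSV file at /data/staging/foo.csv; these names are placeholders, so adjust them to your environment and your usual load method (COPY, external tables with gpfdist, or gpload).

# Truncate the table to discard the corrupted append-only datafiles
psql <dbname> -c "TRUNCATE TABLE public.foo;"
# Reload the data from the original source (placeholder path)
psql <dbname> -c "COPY public.foo FROM '/data/staging/foo.csv' WITH CSV HEADER;"
# Refresh statistics after the reload
psql <dbname> -c "ANALYZE public.foo;"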

It is possible that the primary copy is affected by a hardware issue while the mirror copy is still good. To confirm this, check the primary and mirror configuration and compare the md5sum of the affected file on both copies.

Steps: 

  • Check primary and mirror configuration
psql -c "SELECT g.hostname,p.fselocation from gp_segment_configuration g , pg_filespace_entry p where g.dbid=p.fsedbid and g.content= <refer--segX--from-ERROR>"
Example:
psql -c "SELECT g.hostname,p.fselocation from gp_segment_configuration g , pg_filespace_entry p where g.dbid=p.fsedbid and g.content=0"
 hostname |      fselocation      
----------+-----------------------
 sdw1     | /data1/primary/gpseg0
 sdw2     | /data2/mirror/gpseg0
(2 rows)
  • Compare md5sum (Use file location from the error):
ssh hostname md5sum fselocation/<--location-from-ERROR>
Example for this scenario:  
ssh sdw1 md5sum /data1/primary/gpseg0/base/17150/37815.1152
b46b494f29d73e3cd42c0126b383f3ef  /data1/primary/gpseg0/base/17150/37815.1152
ssh sdw2 md5sum  /data2/mirror/gpseg0/base/17150/37815.1152
52c080e7a92d0b78f99e7760ac48ca03  /data2/mirror/gpseg0/base/17150/37815.1152

In this example, the mirror has a different md5sum value than the primary. Since the checksum error prevents anything further from being written to the datafiles, the mirror still has a valid copy available.

The next action is to perform a failover and validate the data. The following are the steps for that:

  • Make sure the mirror is not already down by running the command below. Both segments should show up and should be in sync. Do not proceed further if the status is 'd' (down) for either the primary or the mirror. The content value comes from the error (seg0 means content=0).
psql -c "SELECT hostname,content,role,port,mode,status from gp_segment_configuration where content= <value--from-ERROR>"
Example for this scenario: 
psql -c "SELECT hostname,content,role,port,mode,status from gp_segment_configuration where content=0"
 hostname | content | role | port  | mode | status 
----------+---------+------+-------+------+--------
 sdw1     |       0 | p    | 40000 | s    | u
 sdw2     |       0 | m    | 50000 | s    | u
(2 rows)
  • Log in to the server hosting the primary and stop the segment. In this case, it is sdw1.
ssh hostname
source $GPHOME/greenplum_path.sh  
pg_ctl -D fselocation stop -m fast  

   

Example for this scenario: 
ssh sdw1
source /usr/local/greenplum/db/greenplum_path.sh 
pg_ctl -D /data1/primary/gpseg0 stop -m fast   
The directory location comes from the "Check primary and mirror configuration" step. You can also run ps -ef | grep postgres | grep select | grep 40000 on the primary host to get the segment directory.
  • Once the segment stops and the failover completes, check the table again by running the previously failing SELECT statement (see the sketch below). The segment failover can take a couple of minutes.
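
For example, a quick validation could look like the following. This is a minimal sketch; content 0, public.foo, and <dbname> are placeholders taken from the error in this scenario.

# Confirm the mirror has taken over as the acting primary for the affected content
psql -c "SELECT hostname,content,role,preferred_role,mode,status from gp_segment_configuration where content=0"
# Re-run a scan that touches the previously failing column
psql <dbname> -c "SELECT count(*) FROM public.foo;"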


After the table data can be accessed successfully, a full recovery is needed to bring the stopped segment back and to take care of any other affected relations, as sketched below.
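
A minimal sketch of that recovery, run as gpadmin from the master host, is shown below; review the utility output at each step and follow your standard recovery procedure if it differs.

# Perform a full recovery of the downed segment, copying its files from the acting primary
gprecoverseg -F
# Monitor resynchronization until the segment pair is back in sync
gpstate -e
# Once resynchronization completes, return the segments to their preferred roles
gprecoverseg -r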

Contact Broadcom Support if you encounter any issues.