ERROR: Header checksum does not match. Expected 0x14C682B and found 0x97777892 (cdbappendonlystorageread.c:856) (seg0 slice1 sdw1:40000 pid=128842) (cdbdisp.c:1322) DETAIL: Append-Only storage header kind 0 unknown Scan of Append-Only Column-Oriented relation 'foo', column #10 'value'. Append-Only segment file 'base/17150/37815.1152', block header offset in file = 0, bufferCount 1
ERROR: Bad append-only storage header. Header check error 3, detail 'Append-only storage header is invalid -- invalid value 0 (none) for header kind (header_bytes_0_3 0x00190934, header_bytes_4_7 0x844ba000)' (cdbappendonlystorageread.c:873) (seg0 slice1 sdw1:40000 pid=128858) (cdbdisp.c:1322) DETAIL: Append-Only storage header kind 0 unknown Scan of Append-Only Column-Oriented relation 'foo', column #10 'value'. Append-Only segment file 'base/17150/37815.1152', block header offset in file = 0, bufferCount 1The error message provides the following information that will be helpful while recovering :
The resolution is divided into two steps i.e identification and recovery.
PGOPTIONS='-c gp_session_role=utility' psql -p 40000 -h sdw1 gpadmin
SELECT p.oid,n.nspname||'.'||p.relname as relation ,p.relfilenode FROM pg_class p , pg_namespace n where p.relnamespace=n.oid and p.relfilenode= <--Relfilenode--from--error--> ;
gpadmin=# SELECT p.oid,n.nspname||'.'||p.relname as relation ,p.relfilenode FROM pg_class p , pg_namespace n where p.relnamespace=n.oid and p.relfilenode=37815 ; oid | relation | relfilenode -------+---------------+------------- 37815 | public.foo | 37815 (1 row)
The following are the steps to get rid of this error:
It is possible that the primary is affected because of the hardware issues but mirror copy is good. Check the primary and mirror configuration and compare md5sum on the affected file to identify this .
Steps:
psql -c "SELECT g.hostname,p.fselocation from gp_segment_configuration g , pg_filespace_entry p where g.dbid=p.fsedbid and g.content= <refer--segX--from-ERROR>"
--------------------------------------------------------------- psql -c "SELECT g.hostname,p.fselocation from gp_segment_configuration g , pg_filespace_entry p where g.dbid=p.fsedbid and g.content= 0" hostname | fselocation ----------+-------------------------- sdw1 | /data1/primary/gpseg0 sdw2 | /data2/mirror/gpseg0 (2 rows) -------------------------------------
ssh hostname md5sum fselocation/<--location-from-ERROR>
ssh sdw1 md5sum /data1/primary/gpseg0/base/17150/37815.1152 b46b494f29d73e3cd42c0126b383f3ef /data1/primary/gpseg0/base/17150/37815.1152 ssh sdw2 md5sum /data2/mirror/gpseg0/base/17150/37815.1152 52c080e7a92d0b78f99e7760ac48ca03 /data2/mirror/gpseg0/base/17150/37815.1152
The mirror has a different md5sum value than primary. Since nothing has been written on the datafiles because of the checksum error, the mirror has a valid copy available.
The next action is to perform failover and validate data. the followiing are the steps for that:
psql -c "SELECT hostname,content,role,port,mode,status from gp_segment_configuration where content= <value--from-ERROR>" <<<< refer in ERROR >>seg0<<"
psql -c "SELECT hostname,content,role,port,mode,status from gp_segment_configuration where content=0" hostname | content | role | port | mode | status --------------------------+---------+------+-------+------+-------- sdw1 | 0 | p | 40000 | s | u sdw2 | 0 | m | 50000 | s | u (2 rows)
ssh hostname source $GPHOME/greenplum_path.sh pg_ctl -D fselocation stop -m fast
ssh sdw1 source /usr/local/greenplum/db/greenplum_path.sh pg_ctl -D /data1/primary/gpseg0 stop -m fast
ps -ef | grep postgres | grep select | grep 40000
on the primary host to get the segment directory
After the successful data access on the table, a full recovery is needed to take care of any other affected relation.
Contact Broadcom Support if you encounter any issues.