Greenplum Database 6.x: Creating UNLOGGED tables and their behavior when there is a segment failover

Article ID: 296307

Products

VMware Tanzu Greenplum

Issue/Introduction

What are unlogged tables? 

UNLOGGED
If specified, the table is created as an unlogged table. Data written to unlogged tables is not written to the write-ahead log (WAL), which makes them considerably faster than ordinary tables. However, the contents of an unlogged table are not replicated to mirror segment instances. Also, an unlogged table is not crash-safe: after a segment instance crash or unclean shutdown, the data for the unlogged table on that segment is truncated. Any indexes created on an unlogged table are automatically unlogged as well.


How to create unlogged tables?  

create unlogged table test_unlogged (a int, b int, c text, d date);

gpadmin=# \d+ test_unlogged 
                Unlogged table "public.test_unlogged"
Column |  Type   | Modifiers | Storage  | Stats target | Description 
--------+---------+-----------+----------+--------------+-------------
a      | integer |           | plain    |              | 
b      | integer |           | plain    |              | 
c      | text    |           | extended |              | 
d      | date    |           | plain    |              | 
Distributed by: (a) 

How to identify unlogged tables in the database?

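Unlogged tables can be identified from the pg_class catalog: the relpersistence column is set to 'u' for unlogged relations. A minimal sketch, filtered to tables (relkind = 'r'):

gpadmin=# select relname from pg_class where relpersistence = 'u' and relkind = 'r';
    relname    
---------------
 test_unlogged
(1 row)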

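For this test, the table was populated with 4000 rows of sample data. Any load method works; a hypothetical load using generate_series:

gpadmin=# insert into test_unlogged select i, i, 'row ' || i, current_date from generate_series(1, 4000) i;
INSERT 0 4000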
Our unlogged table "test_unlogged" has 4000 rows:

gpadmin=# select count(*) from test_unlogged;
count 
-------
  4000
(1 row) 

How many files are created for unlogged tables on the primary vs the mirror?

1. Make a note of the relfilenode using the following SQL (change relname to your unlogged table) and pick a segment to examine, here content=1. For example, gp_segment_id=1 has a relfilenode value of "49174".

gpadmin=# select gp_segment_id, relname, relnamespace, reltuples, relpages, reltype, relkind, relfilenode from gp_dist_random('pg_class') where relname='test_unlogged' order by 1;
gp_segment_id |    relname    | relnamespace | reltuples | relpages | reltype | relkind | relfilenode 
---------------+---------------+--------------+-----------+----------+---------+---------+-------------
             0 | test_unlogged |         2200 |         0 |        0 |   24647 | r       |       24603
             1 | test_unlogged |         2200 |         0 |        0 |   24647 | r       |       49174
             2 | test_unlogged |         2200 |         0 |        0 |   24647 | r       |       24603
             3 | test_unlogged |         2200 |         0 |        0 |   24647 | r       |       24603
(4 rows) 

2. List the file "49174" on the primary and mirror for gpseg1.
**Note that the main data file does not exist on the mirror: unlogged tables are not replicated from primary to mirror because no WAL entries are generated for them. Only the empty init fork ("49174_init"), whose creation is WAL-logged, is present on the mirror.

[gpadmin@gpdbnew-m ~]$ gpssh -h gpdbnew-1 -h gpdbnew-2 "ls -ltr /data/primary/*/base/16384 | grep 49174"
[gpdbnew-2]
[gpdbnew-1] -rw-------. 1 gpadmin gpadmin      0 Apr 13 18:40 49174_init
[gpdbnew-1] -rw-------. 1 gpadmin gpadmin  98304 Apr 18 16:28 49174_fsm
[gpdbnew-1] -rw-------. 1 gpadmin gpadmin  65536 Apr 18 16:28 49174

[gpadmin@gpdbnew-m ~]$ gpssh -h gpdbnew-1 -h gpdbnew-2 "ls -ltr /data/mirror/*/base/16384 | grep 49174"
[gpdbnew-2] -rw-------. 1 gpadmin gpadmin      0 Apr 13 18:40 49174_init
[gpdbnew-1] 

3. As a test, let's stop the primary for gpseg1:

[gpadmin@gpdbnew-1 ~]$ source /usr/local/6.19.4/greenplum-db-6.19.4/greenplum_path.sh
[gpadmin@gpdbnew-1 ~]$ pg_ctl stop -m fast -p 30007 -D /data/primary/gp_6.19.4_202204041858361
waiting for server to shut down.... done
server stopped 

4. Now let's check the file listing on the primary and mirror. **Note that the mirror (now acting as primary for gpseg1) has the file "49174", but it is 0 bytes: on promotion, the unlogged table was reset from its empty init fork, so the mirror copy of the table does not have any data.

[gpadmin@gpdbnew-m ~]$ gpssh -h gpdbnew-1 -h gpdbnew-2 "ls -ltr /data/primary/*/base/16384 | grep 49174"
[gpdbnew-1] -rw-------. 1 gpadmin gpadmin      0 Apr 13 18:40 49174_init
[gpdbnew-1] -rw-------. 1 gpadmin gpadmin  98304 Apr 18 16:28 49174_fsm
[gpdbnew-1] -rw-------. 1 gpadmin gpadmin  65536 Apr 18 16:37 49174
[gpdbnew-2]
[gpadmin@gpdbnew-m ~]$ gpssh -h gpdbnew-1 -h gpdbnew-2 "ls -ltr /data/mirror/*/base/16384 | grep 49174"
[gpdbnew-1]
[gpdbnew-2] -rw-------. 1 gpadmin gpadmin      0 Apr 13 18:40 49174_init
[gpdbnew-2] -rw-------. 1 gpadmin gpadmin      0 Apr 18 16:39 49174 

5. A count on the table now shows missing rows: the table originally had 4000 rows, and the 1,004 rows stored on gpseg1 are gone:

gpadmin=# select count(*) from test_unlogged;
count 
-------
  2996
(1 row) 

6. We now recover the failed primary for content=1 using "gprecoverseg -a". This performs an incremental recovery, bringing the failed instance back online as the mirror for the segment.

[gpadmin@gpdbnew-m ~]$ gprecoverseg -a
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Starting gprecoverseg with args: -a
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.19.4 build commit:953778b47d418bb463e4abb2d982ba27dd281010'
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.19.4 build commit:953778b47d418bb463e4abb2d982ba27dd281010) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Mar  9 2022 00:54:06'
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Obtaining Segment details from master...
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Heap checksum setting is consistent between master and the segments that are candidates for recoverseg
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Greenplum instance recovery parameters
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:----------------------------------------------------------
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Recovery type              = Standard
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:----------------------------------------------------------
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Recovery 1 of 1
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:----------------------------------------------------------
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Failed instance host                 = gpdbnew-1
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Failed instance address              = gpdbnew-1
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Failed instance directory            = /data/primary/gp_6.19.4_202204041858361
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Failed instance port                 = 30007
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Recovery Source instance host        = gpdbnew-2
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Recovery Source instance address     = gpdbnew-2
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/mirror/gp_6.19.4_202204041858361
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Recovery Source instance port        = 35007
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-   Recovery Target                      = in-place
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:----------------------------------------------------------
20220418:16:41:50:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20220418:16:41:51:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20220418:16:41:51:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-1 segment(s) to recover
20220418:16:41:51:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20220418:16:41:52:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
20220418:16:41:52:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Updating configuration with new mirrors
20220418:16:41:52:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Updating mirrors
20220418:16:41:52:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Running pg_rewind on failed segments
gpdbnew-1 (dbid 3): no rewind required
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Starting mirrors
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-era is cd012e9153790fff_220418162752
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Process results...
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Triggering FTS probe
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-********************************
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Segments successfully recovered.
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-********************************
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20220418:16:41:54:008569 gprecoverseg:gpdbnew-m:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes 

7. Recovery of gpseg1 is now complete and all segments are switched back to their preferred roles:

[gpadmin@gpdbnew-m ~]$ gpstate -e 
20220418:16:43:34:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-Starting gpstate with args: -e
20220418:16:43:34:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.19.4 build commit:953778b47d418bb463e4abb2d982ba27dd281010'
20220418:16:43:34:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.19.4 build commit:953778b47d418bb463e4abb2d982ba27dd281010) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Mar  9 2022 00:54:06'
20220418:16:43:34:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-Obtaining Segment details from master...
20220418:16:43:34:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-Gathering data from segments...
20220418:16:43:35:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-----------------------------------------------------
20220418:16:43:35:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-Segment Mirroring Status Report
20220418:16:43:35:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-----------------------------------------------------
20220418:16:43:35:008909 gpstate:gpdbnew-m:gpadmin-[INFO]:-All segments are running normally 

8. However, we see that the table still has missing rows:

[gpadmin@gpdbnew-m ~]$ psql
psql (9.4.26)
Type "help" for help.

gpadmin=# select count(*) from test_unlogged;
count 
-------
  2996
(1 row) 

Cause:
Changes to unlogged tables are not recorded in the WAL. When a segment failover or crash recovery occurs, the table therefore cannot be replayed during the segment's internal startup recovery and is instead reset to empty. Data in an unlogged table must be reloaded from its source after a segment failover or crash recovery.
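A minimal sketch of that reload, reusing the hypothetical sample load from above:

gpadmin=# truncate table test_unlogged;
TRUNCATE TABLE
gpadmin=# insert into test_unlogged select i, i, 'row ' || i, current_date from generate_series(1, 4000) i;
INSERT 0 4000
gpadmin=# select count(*) from test_unlogged;
 count 
-------
  4000
(1 row)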

Tips:
1. When performing hardware maintenance, always check whether unlogged tables need to be copied out first. Identify them using the SQL in the "How to identify unlogged tables in the database?" section and preserve their data, for example as sketched below.
2. A "full" recovery of the segment (gprecoverseg -F) also does not copy unlogged table data to the mirror; the mirror copy of the table will still be 0 bytes.
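One way to preserve and later restore the data (a sketch; the file path is hypothetical, and gpbackup or writable external tables are alternatives):

gpadmin=# copy test_unlogged to '/home/gpadmin/test_unlogged.dat';
COPY 4000
gpadmin=# -- after maintenance or recovery, reload from the file:
gpadmin=# copy test_unlogged from '/home/gpadmin/test_unlogged.dat';
COPY 4000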

Note: 
For append-only unlogged tables, you might see the following error when a TRUNCATE is run on the unlogged AO table and the segment fails over to the mirror node. The error can be ignored; this behavior is expected to be improved in future versions of GPDB. As in the cases above, users need to truncate and reload the unlogged table from source after the segment is recovered:
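For reference, such a table could be created along these lines (a sketch; the original definition of "test_unlogged_ao" is not shown in this article):

gpadmin=# create unlogged table test_unlogged_ao (a int, b int) with (appendonly=true) distributed by (a);
CREATE TABLE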

gpadmin=# select * from test_unlogged_ao;
ERROR:  could not open file "base/16384/65582": No such file or directory  (seg1 slice1 192.168.99.102:35007 pid=28896) 

Environment

Product Version: 6.0

Additional Information

https://gpdb.docs.pivotal.io/latest/ref_guide/sql_commands/CREATE_TABLE.html