How to Identify Append-Optimized Tables that can be Compacted (VACUUMed) in Greenplum Database v4.3
search cancel

How to Identify Append-Optimized Tables that can be Compacted (VACUUMed) in Greenplum Database v4.3

book

Article ID: 295843

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

The purpose of this document is to explain how to identify bloat within append optimized/append optimized column oriented (AO/AOCO) tables that require compaction.

Note:  This document only applies to AO/AOCO tables in v4.3.x.  For heap tables in v4.3.x, please refer to article. This article will also work on AO/AOCO tables in Greenplum 5.x and above.

 

 

Environment


Cause

Append-optimized tables were first introduced in GPDB v.4.3, adding the ability to UPDATE and DELETE data from Append Only tables.   

To accommodate the ability to UPDATE and DELETE data, when an append-optimized table is created, an additional visibility bitmap (visimap) table is created with a B-tree index.  This visimap table contains information on which rows are visible in the append optimized table, including details on the number of records on each segment, how many tuples or rows are visible, and how many are hidden (due to an update or delete).

Since the data is only marked "hidden" during the UPDATE or DELETE operations, these rows still take up disk space and the table will require regular maintenance to reclaim the space by compacting the append-optimized table. 

 

Resolution

The first step is to identify which tables could benefit from compaction.  The gp_toolkit schema provides the function __gp_aovisimap_compaction_info(oid) to display the number of tuples in the on disk data files as compared to the visibility bitmap table and details whether compaction will assist in reclaiming space.

The function utilizes the server configuration parameter gp_appendonly_compaction_threshold to know the threshold in which to flag compaction_possible as true.

The default is 10 or 10%.

After identifying tables which could be compacted, a VACUUM can be run against them to reclaim this space.

The following demonstrates the steps involved in this process.

1.  Query for the oid of the append-only table if the unknown is shown here:

gpadmin=# SELECT oid FROM pg_class WHERE relname='mytable';
oid
-------
47372
(1 row)
2.  Take the oid and query the __gp_aovisimap_compaction_info(oid) function passing that oid.  
 
The command will report back what the gp_append_only_compaction_threshold parameter is set to.
 
The field compaction_possible is equal to TRUE since the percent_hidden field (hidden_tupcount/total_tupcount) is equal to 33.33%.  Or 33.33% of the records are marked hidden in our visimap and can be compacted out of our table.
gpadmin=# SELECT * FROM gp_toolkit.__gp_aovisimap_compaction_info(47372);
NOTICE:  gp_appendonly_compaction_threshold = 10
content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden
---------+----------+---------------------+-----------------+----------------+----------------
      2 |        1 | t                   |           33333 |          99999 |          33.33
(1 row)
3.  Since the table can benefit from compaction, a VACUUM is run against the table as shown below:
gpadmin=# VACUUM mytable;
4.  (Optional) Re-run __gp_aovisimap_compaction_info once again to confirm compaction has occurred.  As expected, compaction_possible is now marked FALSE. This is because there are 0 hidden_tupcount, making the percent_hidden 0.00. This is below the gp_append_compaction_threshold configuration setting:
gpadmin=# SELECT * FROM gp_toolkit.__gp_aovisimap_compaction_info(47372);
NOTICE:  gp_appendonly_compaction_threshold = 10
 content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden
---------+----------+---------------------+-----------------+----------------+----------------
       2 |        1 | f                   |               0 |          66666 |           0.00

 

 



Additional Information

Environment: VMware Tanzu Greenplum 4.3.x