Removing Fragmentation (Bloat) on a Relation Using Redistribute in Greenplum database
search cancel

Removing Fragmentation (Bloat) on a Relation Using Redistribute in Greenplum database

book

Article ID: 295391

calendar_today

Updated On:

Products

VMware Tanzu Greenplum VMware Tanzu Data Suite VMware Tanzu Data Suite

Issue/Introduction

Before beginning, it is highly recommended to read this How to deal with Bloat in Heap tables in VMware Tanzu Greenplum to understand what bloat is.
Bloat can also be removed using vacuum command.

Please read this Different Options to Remove Bloat from a Table on Tanzu Greenplum for information on other methods to remove bloat.

VACUUM FULL and REORGANIZE do have differences in the context of removing bloat.

  • VACUUM FULL's ability to remove bloat is heavily dependent on concurrent activities in the system (including read-only queries. Long running queries can cause some bloat not to be removed)
  • REORGANIZE on other hand has no such dependency and will always remove bloat (caveat is that it is not MVCC safe*).

REORGANIZE can be used to remove bloat and can often be a faster method to remove bloat. VACUUM & VACUUM FULL should be used if MVCC* is more important.

* not MVCC safe - After a table rewrite, the table will appear empty to concurrent transactions, if the query is using a snapshot taken before the rewrite occurred. If this is important, we recommend to use VACUUM FULL.

 

Environment


Resolution

To redistribute a table: 

ALTER TABLE <table_name> SET WITH (REORGANIZE=true); 

This will cause the data to be rewritten to new files with the current distribution key.


Additional Information