How to detect the AO table bloat on the GPDB6 (version before 6.29)
search cancel

How to detect the AO table bloat on the GPDB6 (version before 6.29)

book

Article ID: 381974

calendar_today

Updated On:

Products

Greenplum VMware Tanzu Greenplum

Issue/Introduction

For the version before 6.29, GPDB does not support the user to detect the bloat via the tables pg_stat_all_tables/ pg_stat_user_tables/gp_bloat_diag. On version 6.29, we started to support it.

Resolution

GPDB 6 Method


For Greenplum Database 6, we can use the following PL/pgSQL script to analyze Append-Optimized (AO) tables for potential bloat:

gpadmin=# DO $$
gpadmin$# DECLARE
gpadmin$#     r record;
gpadmin$#     schema_name text;
gpadmin$#     table_name text;
gpadmin$#     tupcount_result bigint;
gpadmin$#     bloat_ratio numeric;
gpadmin$#     formatted_ratio text;
gpadmin$# BEGIN
gpadmin$#     FOR r IN (
gpadmin$#         SELECT
gpadmin$#             c.oid,
gpadmin$#             c.relname,
gpadmin$#             c.reltuples,
gpadmin$#             a.segrelid::oid::regclass::text as ao_seg_table_name
gpadmin$#         FROM pg_class c, pg_appendonly a
gpadmin$#         WHERE c.relkind = 'r'
gpadmin$#         AND a.relid=c.oid
gpadmin$#         AND array_to_string(c.reloptions, ',') LIKE '%appendonly=true%'
gpadmin$#     ) LOOP
gpadmin$#         -- Split the schema and table name
gpadmin$#         schema_name := split_part(r.ao_seg_table_name, '.', 1);
gpadmin$#         table_name := split_part(r.ao_seg_table_name, '.', 2);
gpadmin$#
gpadmin$#         -- Get tupcount
gpadmin$#         EXECUTE format('SELECT COALESCE(SUM(tupcount), 0) FROM %I.%I', schema_name, table_name)
gpadmin$#         INTO tupcount_result;
gpadmin$#
gpadmin$#         -- Calculate bloat ratio as percentage
gpadmin$#         IF tupcount_result > 0 THEN
gpadmin$#             bloat_ratio := 100 - (r.reltuples::numeric / tupcount_result::numeric) * 100;
gpadmin$#         ELSE
gpadmin$#             bloat_ratio := 0;
gpadmin$#         END IF;
gpadmin$#
gpadmin$#         formatted_ratio := TRIM(TO_CHAR(ROUND(bloat_ratio, 2), '999999999.99')) || '%';
gpadmin$#
gpadmin$#         -- Output comparison results with formatted percentage
gpadmin$#         RAISE NOTICE 'Table: %, Reltuples: %, Tupcount: %, Bloat Ratio: %, Status: %',
gpadmin$#             r.relname,
gpadmin$#             r.reltuples::bigint,
gpadmin$#             tupcount_result,
gpadmin$#             formatted_ratio,
gpadmin$#             CASE
gpadmin$#                 WHEN tupcount_result = 0 THEN 'Empty Table'
gpadmin$#                 WHEN bloat_ratio > 80 THEN 'Significant Bloat'
gpadmin$#                 WHEN bloat_ratio > 50 and bloat_ratio <= 80 THEN 'Medium Bloat'
gpadmin$#                 ELSE 'No Bloat'
gpadmin$#             END;
gpadmin$#     END LOOP;
gpadmin$# END $$;
NOTICE:  Table: aoco, Reltuples: 0, Tupcount: 0, Bloat Ratio: .00%, Status: Empty Table
NOTICE:  Table: ao, Reltuples: 100, Tupcount: 10100, Bloat Ratio: 99.01%, Status: Significant Bloat
NOTICE:  Table: sales, Reltuples: 0, Tupcount: 0, Bloat Ratio: .00%, Status: Empty Table
NOTICE:  Table: sales_1_prt_1, Reltuples: 0, Tupcount: 0, Bloat Ratio: .00%, Status: Empty Table
NOTICE:  Table: sales_1_prt_2, Reltuples: 0, Tupcount: 0, Bloat Ratio: .00%, Status: Empty Table
NOTICE:  Table: sales_1_prt_3, Reltuples: 0, Tupcount: 0, Bloat Ratio: .00%, Status: Empty Table
NOTICE:  Table: sales_1_prt_31, Reltuples: 0, Tupcount: 0, Bloat Ratio: .00%, Status: Empty Table
NOTICE:  Table: sales_1_prt_32, Reltuples: 10000, Tupcount: 10000, Bloat Ratio: .00%, Status: No Bloat
 
This script provides the following information for each AO table:
  • Table name
  • Number of tuples (reltuples)
  • Actual tuple count
  • Bloat ratio (as a percentage)
  • Bloat status (Empty Table, Significant Bloat, Medium Bloat, or No Bloat)

GPDB 7 Considerations

For Greenplum Database 7, the performance of the above script may be significantly slower. Instead, it's recommended to use the R&D scripts that utilize gp_toolkit.__gp_aoseg_all and gp_toolkit.__gp_aocsseg_all for more efficient analysis of AO table bloat.Note: The specific R&D scripts for GPDB 7 are not provided in this knowledge base entry. Users should consult the official Greenplum documentation or contact Greenplum support for the most up-to-date and efficient methods for analyzing AO table bloat in GPDB 7.

Interpreting Results

  • Empty Table: The table contains no rows.
  • Significant Bloat: More than 80% of the table space is potentially wasted.
  • Medium Bloat: Between 50% and 80% of the table space is potentially wasted.
  • No Bloat: Less than 50% of the table space is potentially wasted.
Tables with significant or medium bloat may benefit from maintenance operations such as vacuum or rewrite to reclaim space and improve query performance.