Show user tables that have never had a VACUUM or ANALYZE, or last had these operations more than a time interval ago
search cancel

Show user tables that have never had a VACUUM or ANALYZE, or last had these operations more than a time interval ago

book

Article ID: 296725

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

This article describes how to leverage the pg_stat_user_tables table to list user tables in the Tanzu Greenplum cluster that have never had a VACUUM or ANALYZE, or last had the operations performed more than a time interval ago.

The scripts here are intended as a supplement to the gp_toolkit.gp_stats_missing and gp_toolkit.gp_bloat_diag views that show tables that have bloat and missing statistics - the views are documented at https://gpdb.docs.pivotal.io/6-18/ref_guide/gp_toolkit.html


Note: Verify the scripts mentioned in this article on a test cluster before running it on production or important clusters/or database.

Environment

Product Version: 6.17

Resolution

VACUUM:
  • Show user tables that have never had a VACUUM, or had their last VACUUM more than a defined interval ago - the interval in the example below is set to 1 week.
  • The 'gp_toolkit' schema is excluded.
  • User tables that have never had a VACUUM show a blank 'last_vacuum' - ordering 'by 2 DESC' shows these user tables first.
brianh=# select now();
              now              
-------------------------------
 2022-01-11 15:52:42.221707+00
(1 row)

brianh=# SELECT   schemaname || '.' || relname AS table,
         last_vacuum
FROM     pg_stat_user_tables
WHERE    now() - last_vacuum > interval '1 week'
OR       last_vacuum IS NULL
AND      schemaname!='gp_toolkit'
ORDER BY 2 DESC;
     table      |          last_vacuum          
----------------+-------------------------------
 public.aabbcc  | 
 public.aabb    | 
 public.discogs | 2021-12-07 13:46:51.944288+00
 public.aa      | 2021-12-07 13:27:33.543288+00
(4 rows)

brianh=# 

ANALYZE:
  • Show user tables that have never had an ANALYZE, or had their last ANALYZE more than a defined interval ago - the interval in the example below is set to 1 week.
  • The 'gp_toolkit' schema is excluded.
  • User tables that have never had an ANALYZE show a blank 'last_analyze' - ordering 'by 2 DESC' shows these user tables first.
brianh=# select now();
              now              
-------------------------------
 2022-01-11 15:55:13.871748+00
(1 row)

brianh=# SELECT   schemaname || '.' || relname AS table,
         last_analyze
FROM     pg_stat_user_tables
WHERE    now() - last_analyze > interval '1 week'
OR       last_analyze IS NULL
AND      schemaname!='gp_toolkit'
ORDER BY 2 DESC;
     table      |         last_analyze          
----------------+-------------------------------
 public.aabb    | 
 public.aabbcc  | 2021-12-07 13:52:35.436483+00
 public.aa      | 2021-12-07 13:27:37.45639+00
 public.discogs | 2021-11-23 18:38:41.043254+00
(4 rows)

brianh=#