Performance Degradation due to Autovacuum Running on Pivotal Greenplum Database

Article ID: 295707

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

When autovacuum kicks in (vacuum starting automatically), a sudden degradation in database performance is observed. You can use the command below to check whether autovacuum is running anywhere on the cluster:

gpssh -f hostfile "ps -ef | grep autovacuum | grep -v grep"

Environment


Cause

Autovacuum kicks in when the transaction age of one or more databases exceeds 200,000,000 (the "autovacuum_freeze_max_age" default, which cannot be altered).

Note: This is "per segment."

When autovacuum runs, table statistics are lost, i.e. columns such as pg_class.relpages and pg_class.reltuples are reset to 0, resulting in queries running slower than usual and causing a performance bottleneck.
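
To confirm the threshold in effect on your cluster, you can check the setting from the catalog; a minimal check (pg_settings is assumed to expose the parameter in your GPDB build):

    select name, setting from pg_settings where name = 'autovacuum_freeze_max_age';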

Resolution

We highly recommend upgrading your Greenplum Database to version 4.2.5.1 or later, as the autovacuum daemon is disabled in these versions.

For GPDB versions below 4.2.5.1, you can use the workaround below to reduce the transaction (XID) age.

  • Find the databases/segments with a high transaction age and vacuum the tables that are approaching the limit set via autovacuum_freeze_max_age.
  • Use the following SQL to find which databases, on which instances, have an age over 200 million:
    SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC; -- master
    SELECT gp_segment_id, datname, age(datfrozenxid) FROM gp_dist_random('pg_database') ORDER BY 3 DESC; -- segments

  • Sometimes, but not always, the problem can be the result of orphaned temporary schemas. These can be discovered using gpcheckcat or with:
    select * from pg_namespace where nspname like 'pg_temp%';
    select * from gp_dist_random('pg_namespace') where nspname like 'pg_temp%';

Orphaned temporary schemas can be removed with:

drop schema pg_temp_<sess_id> cascade;
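
If many such schemas are present, you can generate the DROP statements from the catalog query above; a minimal sketch for the master (review the output and confirm the owning sessions are gone before running it):

    -- build one DROP statement per orphaned temp schema
    select 'drop schema ' || nspname || ' cascade;' from pg_namespace where nspname like 'pg_temp%';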

If the age is still high after this, follow the next set of instructions.

  • If the database is relatively small, just vacuum everything in it by running "vacuumdb <database_name>" in the shell.

This is useful for the system databases ("postgres", "template1", "gpperfmon") and smaller user databases.
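
For example, a minimal shell sketch covering those system databases (assuming vacuumdb is on the PATH and default connection settings; add options such as -p or -U as needed for your environment):

    # vacuum the small system databases one by one (run as gpadmin from the master)
    for db in postgres template1 gpperfmon; do
        vacuumdb $db
    done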

  • If the database is large, for each database/segment identify which tables are at or nearing a high transaction age (> 100,000,000) and vacuum them:
    select relname, age(relfrozenxid) from pg_class where relkind = 'r' and relstorage != 'x' and age(relfrozenxid) > 100000000; -- master
    select relname, age(relfrozenxid) from gp_dist_random('pg_class') where relkind = 'r' and relstorage != 'x' and age(relfrozenxid) > 100000000; -- segments
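
To avoid building the VACUUM commands by hand, you can generate them from the same catalog query; a sketch for the master (a similar query via gp_dist_random shows what each segment still needs):

    -- build one VACUUM statement per table over the threshold
    select 'vacuum ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname) || ';'
    from pg_class c join pg_namespace n on n.oid = c.relnamespace
    where c.relkind = 'r' and c.relstorage != 'x' and age(c.relfrozenxid) > 100000000;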
    

Repeat the procedure until the tables/relations/databases are below the transaction age of 200,000,000 (the "autovacuum_freeze_max_age" default).

  • During the autovacuum run, some relations might have lost their statistics; analyze those tables. A quick way to find them is to check whether pg_class.reltuples is marked 0 for your major tables.
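
A sketch that lists candidate tables on the master (note that genuinely empty tables also show reltuples = 0, so review the list before running ANALYZE):

    select relname from pg_class
    where relkind = 'r' and relstorage != 'x' and reltuples = 0 and relpages = 0;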

Note: The "template0" is marked as "no connections" so does not need to be handled by above procedure.