Greenplum disk utilization heavily increasing greater than 95% on all hosts

search cancel

Greenplum disk utilization heavily increasing greater than 95% on all hosts

book

Article ID: 396323

calendar_today

Updated On:

Products

VMware Tanzu Data Suite VMware Tanzu Greenplum VMware Tanzu Data Suite

Issue/Introduction

The data segment directories are full or close to 100% full.

gpcheckcat does not report any errors.

You need to check the size of each table in the database.

Resolution

Find the size of all tables with the SQL below.

Running SQL in a segment

The following needs to be run by connecting to a particular segment, it will fail if run on the master/coordinator.

Replace all text in bold and itallics below with appropriate values for the

$ PGOPTIONS='-c gp_session_role=utility' psql -d <DBname> -h <segmentHost> -p <segmentPort>

my_prod_db=# create temp table total_sizes_tmp as (
   select nspname,relname,relfilenode,pg_total_relation_size(c.oid) 
   from pg_class c join pg_namespace n on c.relnamespace = n.oid 
   where c.relkind in ('r') );
SELECT 86845

my_prod_db=# select * from total_sizes_tmp order by 4 desc limit 10;
        nspname        |                relname                | relfilenode | pg_total_relation_size
-----------------------+---------------------------------------+-------------+------------------------
 schemabig             | problem_table                         |   205973934 |           645035622400
 schema01              | table01                               |   205960559 |            22080122584
 schema02              | table02                               |   205950513 |            17920524352
 schema02              | table03                               |   205988229 |            11570190040
 schema03              | table04                               |   205985913 |             8871298528
 :
 :
(10 rows)

The largest tables are listed above.

You investigate the very large tables to see why they are so large and if they can be truncated or dropped to reduce the disk usage.

Running from the master/coordinator

It is possible to run the following from the master/coordinator, but it may take much longer to run as it needs to process all files in the database, rather than the files for just one segment.

It also does not save the result to a table or temp table for further queries.

$ psql -d <DBname>

# select schemaname || '.' || tablename as table, pg_total_relation_size(schemaname || '.' || tablename) from pg_tables order by 2 desc;

Feedback

thumb_up Yes

thumb_down No