Benefits of doing regular VACUUMs to clear up bloat in Postgres
search cancel

Benefits of doing regular VACUUMs to clear up bloat in Postgres

book

Article ID: 296405

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

This article highlights why it is recommend to do regular vacuums to clear up bloat in order to save disk space. 

There's generally a confusion on the difference between regular VACUUM and VACUUM FULL. For more details, refer to 24.1. Routine Vacuuming.

The main differences between VACUUM and VACUUM FULL in regards to this article are:

VACUUM

  • Acquires a less restrictive SHARE UPDATE EXCLUSIVE lock so it can be run while the database is online.
  • Removes dead row versions, but does not reclaim disk space except when it's at the end of the file and proper lock is obtained.


VACUUM FULL

  • Acquires a more restrictive EXCLUSIVE lock so we generally recommend running it during maintenance windows.
  • Compacts the table by rewriting into more efficient table file which releases free space back to disk.


Since Postgres 8.1, there is an autovacuum daemon that you can activate which automates routine vacuums. For more information, refer to 24.1.6. The Autovacuum Daemon.

If your user tables are not regularly vacuumed, the size of the table on disk can grow quite fast and will lead to table bloat. This usually requires a vacuum full, which can take a very long time. Running regular vacuums can prevent this.

We will use these queries to show the space differences.

Table DDL

create table users (id int, name text);

postgres=# \d+ users
                                   Table "public.users"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 id     | integer |           |          |         | plain    |              | 
 name   | text    |           |          |         | extended |              | 
Access method: heap

Number of rows
SELECT oid,relname,relpages,
        pg_stat_get_live_tuples(oid) AS n_live_tup
      , pg_stat_get_dead_tuples(oid) AS n_dead_tup
 from pg_class where relname ='users' or oid in  (select reltoastrelid from  pg_class where relname ='users');
  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 18497 | pg_toast_18494 |        0 |          0 |          0
 18494 | users          |        0 |          0 |          0
(2 rows)

Table Sizes
SELECT *, pg_size_pretty(total_bytes) AS total
     , pg_size_pretty(index_bytes) AS INDEX
     , pg_size_pretty(toast_bytes) AS toast
     , pg_size_pretty(table_bytes) AS TABLE
   FROM (
   SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes FROM (
       SELECT c.oid,nspname AS table_schema, relname AS TABLE_NAME
               , c.reltuples AS row_estimate
               , pg_total_relation_size(c.oid) AS total_bytes
               , pg_indexes_size(c.oid) AS index_bytes
               , pg_total_relation_size(reltoastrelid) AS toast_bytes
           FROM pg_class c
           LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
           WHERE (relname = 'users' or c.oid in  (select reltoastrelid from  pg_class where relname ='users') )
   ) a
 ) a
 order by row_estimate desc limit 5 ;
 
  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes |   total    |   index    |   toast    |  table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+------------+------------+------------+---------
18497 | pg_toast     | pg_toast_18497 |            0 |        8192 |        8192 |             |           0 | 8192 bytes | 8192 bytes |            | 0 bytes
18494 | public       | users          |            0 |        8192 |           0 |        8192 |           0 | 8192 bytes | 0 bytes    | 8192 bytes | 0 bytes
(2 rows)


Environment

Product Version: 13.2

Resolution

Example with no regular VACUUM


1. Insert 5000 rows:

insert into users values(generate_series(1,5000), generate_random_string(10000));

2. Check number of table rows and space occupied:
  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 18494 | users          |       32 |       5000 |          0
 18497 | pg_toast_18494 |     7500 |      30000 |          0
(2 rows)

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 18497 | pg_toast     | pg_toast_18494 |        30000 |    62169088 |      688128 |             |    61480960 | 59 MB | 672 kB  |       | 59 MB
 18494 | public       | users          |         5000 |    62464000 |           0 |    62169088 |      294912 | 60 MB | 0 bytes | 59 MB | 288 kB
(2 rows)

3. Delete 400 rows and then insert 400 rows:
postgres=# delete from users where id < 401;
DELETE 400
postgres=# insert into users values(generate_series(1,400), generate_random_string(10000));
INSERT 0 400

4. Check number of rows and disk space:
  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 18494 | users          |       35 |       5000 |        400
 18497 | pg_toast_18494 |     7500 |      30000 |       2400
(2 rows)

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 18497 | pg_toast     | pg_toast_18494 |        30000 |    67141632 |      745472 |             |    66396160 | 64 MB | 728 kB  |       | 63 MB
 18494 | public       | users          |         5000 |    67461120 |           0 |    67141632 |      319488 | 64 MB | 0 bytes | 64 MB | 312 kB
(2 rows)

As we can see, the size of the table increased from 60 MB to 64 MB. Now let's do the same example with a regular vacuum after the delete.


Example with regular VACUUM

After truncating the table to clear up the status, we have this starting state:
  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 23911 | pg_toast_23908 |        0 |          0 |          0
 23908 | users          |        0 |          0 |          0
(2 rows)

1. Insert 5000 rows:
postgres=# insert into users values(generate_series(1,5000), generate_random_string(10000));
INSERT 0 5000

2. Row and table sizes after initial insert:
  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 23911 | pg_toast_23908 |     7500 |      30000 |          0
 23908 | users          |       32 |       5000 |          0
(2 rows)

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 23911 | pg_toast     | pg_toast_23908 |        30000 |    62169088 |      688128 |             |    61480960 | 59 MB | 672 kB  |       | 59 MB
 23908 | public       | users          |         5000 |    62464000 |           0 |    62169088 |      294912 | 60 MB | 0 bytes | 59 MB | 288 kB
(2 rows)

3. Delete 400 rows and then VACUUM the table:
postgres=# delete from users where id < 401;
DELETE 400
postgres=# vacuum users;
VACUUM
4. Now insert 400 new rows. At this point we will have 5000 rows, which is the same as the previous section:
postgres=# insert into users values(generate_series(1,400), generate_random_string(10000));
INSERT 0 400

  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 23911 | pg_toast_23908 |     7500 |      30000 |          0
 23908 | users          |       32 |       5000 |          0
(2 rows)

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 23911 | pg_toast     | pg_toast_23908 |        27600 |    62226432 |      745472 |             |    61480960 | 59 MB | 728 kB  |       | 59 MB
 23908 | public       | users          |         5000 |    62521344 |           0 |    62226432 |      294912 | 60 MB | 0 bytes | 59 MB | 288 kB
(2 rows)

Notice these same 5000 rows only takes up 60 MB instead of 64 MB without regular vacuum. Now imagine this scenario happening a few hundred times in normal transaction processing and these space differences can add up. That's why it is important to work in regular vacuums periodically to avoid table bloat.

In this next section we can show an example where if you don't routinely vacuum, only a vacuum full would be able to reclaim disk space.


Vacuum Full Example

Truncating the table and starting with a fresh table:
  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 34727 | users          |        0 |          0 |          0
 34730 | pg_toast_34727 |        0 |          0 |          0
(2 rows)

1. Add 5000 rows and check the state:
postgres=# insert into users values(generate_series(1,5000), generate_random_string(10000));
INSERT 0 5000

  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 34727 | users          |        0 |       5000 |          0
 34730 | pg_toast_34727 |        0 |      30000 |          0
(2 rows)

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 34730 | pg_toast     | pg_toast_34727 |        30000 |    62169088 |      688128 |             |    61480960 | 59 MB | 672 kB  |       | 59 MB
 34727 | public       | users          |         5000 |    62464000 |           0 |    62169088 |      294912 | 60 MB | 0 bytes | 59 MB | 288 kB
(2 rows)

2. Delete 400 rows and insert 400 rows after without VACUUM and check the state:
postgres=# delete from users where id < 401;
DELETE 400
postgres=# insert into users values(generate_series(1,400), generate_random_string(10000));
INSERT 0 400

  oid  |    relname     | relpages | n_live_tup | n_dead_tup 
-------+----------------+----------+------------+------------
 34727 | users          |       32 |       5000 |        400
 34730 | pg_toast_34727 |     7500 |      30000 |       2400
(2 rows)

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 34730 | pg_toast     | pg_toast_34727 |        30000 |    67141632 |      745472 |             |    66396160 | 64 MB | 728 kB  |       | 63 MB
 34727 | public       | users          |         5000 |    67461120 |           0 |    67141632 |      319488 | 64 MB | 0 bytes | 64 MB | 312 kB
(2 rows)

3. Now try to VACUUM the users table:
postgres=# vacuum users;
VACUUM

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 34730 | pg_toast     | pg_toast_34727 |        30000 |    67141632 |      745472 |             |    66396160 | 64 MB | 728 kB  |       | 63 MB
 34727 | public       | users          |         5000 |    67461120 |           0 |    67141632 |      319488 | 64 MB | 0 bytes | 64 MB | 312 kB
(2 rows)

You can see the space did not get reduced. The transactions will be reused, but in this specific case, a VACUUM FULL is needed to reclaim disk space:
postgres=# vacuum full users;
VACUUM

  oid  | table_schema |   table_name   | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total |  index  | toast | table  
-------+--------------+----------------+--------------+-------------+-------------+-------------+-------------+-------+---------+-------+--------
 34727 | public       | users          |         5000 |    62390272 |           0 |    62128128 |      262144 | 60 MB | 0 bytes | 59 MB | 256 kB
 34730 | pg_toast     | pg_toast_34727 |            0 |    62128128 |      688128 |             |    61440000 | 59 MB | 672 kB  |       | 59 MB
(2 rows)


As expected, we reclaimed the space from the 400 deleted rows!
 

Additional Information