Platform Services disk usage is high or very high for component 'postgresql-ha-postgresql' and database 'pace'

Article ID: 433936

Updated On:

Products

VMware vDefend Firewall with Advanced Threat Prevention
VMware vDefend Firewall

Issue/Introduction

The Security Services Platform (SSP) may raise the alarm "Platform Services Disk usage is high or very high for component postgresql-ha-postgresql".

In PostgreSQL, high transactional churn, such as bulk updates after an upgrade or during certificate rotation, can generate dead tuples faster than the automated background cleanup process (autovacuum) can remove them. As a result, the physical files on disk grow significantly larger than the live data they contain, which triggers the high disk usage alarm.

 

The alarm indicates that the PostgreSQL database used by the Security Services Platform is consuming an abnormally large portion of the allocated persistent storage. If left unaddressed, this can lead to disk exhaustion and subsequent platform outages.

Recommended Action: Follow the instructions in this KB to remediate the issue.

Environment

SSP 5.0, SSP 5.1.0

Cause

    
PostgreSQL uses Multiversion Concurrency Control (MVCC). When rows are updated or deleted, they are not immediately removed from the disk; instead, they are marked as "dead tuples."

Under periods of high transactional churn (frequent updates/deletes) within the pace database, the automated background cleanup process (autovacuum) may fail to keep pace with the generation of dead tuples. This results in "table bloat," where the physical files on disk grow significantly larger than the actual live data they contain.

Resolution

To remediate this issue, you must verify the disk usage, identify the bloated tables within the pace database, and reclaim the space using a VACUUM FULL operation.

1. First, confirm that the postgresql-ha-postgresql-0 pod is experiencing high disk utilization on its data volume.

    Execute the following command to check the filesystem usage of the postgresql-ha-postgresql-0 pod:

kubectl -n nsxi-platform exec postgresql-ha-postgresql-0 -- df -h
Example response
Filesystem                          Size  Used Avail Use% Mounted on
overlay                              98G  8.5G   85G  10% /
tmpfs                                64M     0   64M   0% /dev
shm                                  64M  284K   64M   1% /dev/shm
tmpfs                               6.2G   12M  6.2G   1% /proc/acpi
/dev/mapper/ssp-var+lib+containers   98G  8.5G   85G  10% /etc/hosts
tmpfs                               6.0G  8.0K  6.0G   1% /certs
/dev/sdb                             49G   36G   12G  76% /bitnami/postgresql      <<-----------------  
tmpfs                               6.0G   12K  6.0G   1% /run/secrets/kubernetes.io/serviceaccount  

Review the output and check the Use% for the /bitnami/postgresql mount. If the usage is critically high (e.g., >75%), proceed further.
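
As a rough aid for the check in step 1, the sketch below extracts the Use% value for a mount point from df output and compares it with a threshold. The function name mount_use_pct and the fixed threshold of 75 are illustrative assumptions, not part of the product. The sample lines are copied from the example response above, without the arrow annotation.

```shell
#!/bin/sh
# Print the Use% of a mount point (as a bare integer) from `df` output on stdin.
# mount_use_pct is a hypothetical helper name used only for this sketch.
mount_use_pct() {
    # $1 = mount point. In df output the mount point is the last field
    # and Use% is the field just before it.
    awk -v m="$1" '$NF == m { gsub("%", "", $(NF-1)); print $(NF-1) }'
}

# Sample df lines taken from the example response in this article.
sample='Filesystem                          Size  Used Avail Use% Mounted on
/dev/sdb                             49G   36G   12G  76% /bitnami/postgresql'

pct=$(printf '%s\n' "$sample" | mount_use_pct /bitnami/postgresql)

# 75 mirrors the ">75%" guidance above; adjust as needed.
if [ "$pct" -gt 75 ]; then
    echo "usage ${pct}% exceeds threshold"
fi
```

On a live system, the output of the df command from step 1 could be piped into the same function instead of the sample text.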

 

2. Next, log in to the postgresql-ha-postgresql-0 pod to confirm that the pace database is the primary consumer of space.

    Run the following query to list the databases by size, largest first:

 

  SELECT datname, pg_size_pretty(pg_database_size(datname))
  FROM pg_database
  ORDER BY pg_database_size(datname) DESC;

 

Example response

 datname         | pg_size_pretty
-----------------+----------------
 pace            | 35 GB     <---------------------
 authelialdap    | 9933 kB
 authelia        | 9869 kB
 alarms          | 9485 kB
 clusterapi      | 9181 kB
 repmgr          | 8085 kB
 upgrade         | 8021 kB
 sparkjobmanager | 7925 kB
 postgres        | 7909 kB
 template0       | 7761 kB
 druid           | 7761 kB
 template1       | 7761 kB
(12 rows) 

 

3. If the pace database is consuming a disproportionately large amount of space, identify the tables that account for it.
    Execute the following query to list the 10 largest tables in the nsx_config schema by total size, along with the estimated dead tuple size of each:

 

SELECT 
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_indexes_size(relid)) AS indexes_size,
    pg_total_relation_size(relid) AS total_size_bytes,
    n_live_tup AS live_rows,
    n_dead_tup AS dead_rows,
    ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio_pct,
    -- Estimate dead tuple size: (table_size / total_tuples) * dead_tuples
    pg_size_pretty(
        ((pg_relation_size(relid)::numeric / NULLIF(n_live_tup + n_dead_tup, 0)) * n_dead_tup)::bigint
    ) AS estimated_dead_size,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'nsx_config'
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;

 

The query returns the 10 largest tables in the schema, with the estimated dead tuple size of each in the estimated_dead_size column. For example:

Example response

 schemaname |                 relname                 | total_size | table_size | indexes_size | total_size_bytes | live_rows | dead_rows | dead_ratio_pct | estimated_dead_size |          last_vacuum          |        last_autovacuum        |         last_analyze          |       last_autoanalyze        
------------+-----------------------------------------+------------+------------+--------------+------------------+-----------+-----------+----------------+---------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------
 nsx_config | normalizeddfwruleconfig                 | 7500 MB    | 6000 MB    | 1500 MB      |       7864320000 |   1000000 |   3000000 |          75.00 | 4500 MB             | 2026-03-15 02:00:00.123456+00 | 2026-03-19 14:22:10.987654+00 | 2026-03-15 02:05:00.123456+00 | 2026-03-19 14:23:10.123456+00
 nsx_config | normalizedgroupeffectiveipaddressconfig | 6500 MB    | 5000 MB    | 1500 MB      |       6815744000 |   2000000 |   4000000 |          66.67 | 3333 MB             | 2026-03-10 01:00:00.123456+00 | 2026-03-19 09:11:22.456789+00 | 2026-03-10 01:05:00.123456+00 | 2026-03-19 09:12:00.123456+00
 nsx_config | vm_properties_msg                       | 5120 MB    | 4096 MB    | 1024 MB      |       5368709120 |    500000 |   1000000 |          66.67 | 2730 MB             | NULL                          | 2026-03-19 08:45:12.345678+00 | NULL                          | 2026-03-19 08:46:12.345678+00
 nsx_config | normalizedcomputeconfig                 | 4096 MB    | 3072 MB    | 1024 MB      |       4294967296 |    300000 |    900000 |          75.00 | 2304 MB             | 2026-03-01 03:00:00.123456+00 | 2026-03-19 22:10:05.123456+00 | 2026-03-01 03:05:00.123456+00 | 2026-03-19 22:11:00.123456+00
 nsx_config | virtual_machine_container_msg           | 3072 MB    | 2560 MB    | 512 MB       |       3221225472 |    400000 |    600000 |          60.00 | 1536 MB             | NULL                          | 2026-03-19 18:30:45.123456+00 | NULL                          | 2026-03-19 18:31:00.123456+00
 nsx_config | service_entry_l4_port_set_service_entry | 2048 MB    | 1536 MB    | 512 MB       |       2147483648 |   1000000 |   1000000 |          50.00 | 768 MB              | 2026-02-15 04:00:00.123456+00 | 2026-03-19 11:22:33.123456+00 | 2026-02-15 04:05:00.123456+00 | 2026-03-19 11:23:00.123456+00
 nsx_config | service_entry                           | 1536 MB    | 1024 MB    | 512 MB       |       1610612736 |    500000 |    500000 |          50.00 | 512 MB              | NULL                          | 2026-03-19 05:15:20.123456+00 | NULL                          | 2026-03-19 05:16:20.123456+00
 nsx_config | vni_msg                                 | 1024 MB    | 800 MB     | 224 MB       |       1073741824 |    100000 |    300000 |          75.00 | 600 MB              | 2026-03-10 05:00:00.123456+00 | 2026-03-19 01:10:15.123456+00 | 2026-03-10 05:05:00.123456+00 | 2026-03-19 01:11:15.123456+00
 nsx_config | compute_collection                      | 512 MB     | 400 MB     | 112 MB       |        536870912 |     50000 |    100000 |          66.67 | 266 MB              | NULL                          | 2026-03-19 20:45:30.123456+00 | NULL                          | 2026-03-19 20:46:00.123456+00
 nsx_config | service                                 | 256 MB     | 200 MB     | 56 MB        |        268435456 |     10000 |     40000 |          80.00 | 160 MB              | 2026-03-01 06:00:00.123456+00 | 2026-03-19 04:20:10.123456+00 | 2026-03-01 06:05:00.123456+00 | 2026-03-19 04:21:10.123456+00
(10 rows)

 

Note the tables with the highest estimated_dead_size and dead_ratio_pct. These are our targets for remediation.
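
The estimated_dead_size column is derived from the formula in the query comment: (table_size / total_tuples) * dead_tuples. The sketch below reproduces that arithmetic for a single row as a quick sanity check. The helper name est_dead_mb is hypothetical, and working in whole megabytes is a simplification (the query itself works in bytes).

```shell
#!/bin/sh
# Estimate the space held by dead tuples in a table.
# est_dead_mb is a hypothetical helper name used only for this sketch.
est_dead_mb() {
    # $1 = table size in MB, $2 = live rows, $3 = dead rows
    awk -v size="$1" -v live="$2" -v dead="$3" 'BEGIN {
        total = live + dead
        # Mirrors NULLIF(n_live_tup + n_dead_tup, 0) in the query.
        if (total == 0) { print "NULL"; exit }
        # Multiply before dividing to avoid floating-point drift.
        printf "%.0f\n", size * dead / total
    }'
}

# First row of the example response: 6000 MB table, 1000000 live, 3000000 dead.
est_dead_mb 6000 1000000 3000000   # prints 4500
```

The result matches the 4500 MB shown for normalizeddfwruleconfig in the example response. It is only an estimate, since it assumes live and dead tuples have the same average size.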

Contact Broadcom Technical Support to run the VACUUM command under supervision.