Platform Services disk usage is high or very high for component 'postgresql-ha-postgresql' and database 'pace'

Article ID: 433936

Updated On:

Products

VMware vDefend Firewall with Advanced Threat Prevention, VMware vDefend Firewall

Issue/Introduction

The Security Services Platform (SSP) may raise the alarm "Platform Services disk usage is high or very high for component 'postgresql-ha-postgresql'".

In PostgreSQL, operations with high transactional churn, such as bulk updates after an upgrade or certificate rotations, can generate dead tuples faster than the automated background cleanup process (autovacuum) removes them. As a result, the physical files on disk grow significantly larger than the live data they contain, which triggers the high disk usage alarm.

The alarm indicates that the PostgreSQL database used by the Security Services Platform is consuming an abnormally large portion of the allocated persistent storage. If left unaddressed, this can lead to disk exhaustion and subsequent platform outages.

Recommended Action: Follow the instructions in this KB to remediate the issue.

Environment

SSP 5.0, SSP 5.1.0

Cause

PostgreSQL uses Multiversion Concurrency Control (MVCC). When rows are updated or deleted, they are not immediately removed from the disk; instead, they are marked as "dead tuples."

Under periods of high transactional churn (frequent updates/deletes) within the pace database, the automated background cleanup process (autovacuum) may fail to keep pace with the generation of dead tuples. This results in "table bloat," where the physical files on disk grow significantly larger than the actual live data they contain.
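
The effect described above can be illustrated with a toy model. This is a minimal sketch, not a description of PostgreSQL internals: the function name simulate_bloat and the fixed row_bytes value are hypothetical, and the model assumes no vacuum runs between update passes and that no freed space is reused. It ignores page layout, indexes, and update optimizations.

```python
def simulate_bloat(live_rows, bulk_update_passes, row_bytes=100):
    """Toy model: every UPDATE leaves the old row version behind as a dead tuple."""
    dead_rows = 0
    for _ in range(bulk_update_passes):
        # Each pass rewrites every live row; the previous versions become dead tuples.
        dead_rows += live_rows
    live_bytes = live_rows * row_bytes
    # Without cleanup, the file holds both the live and the dead row versions.
    on_disk_bytes = (live_rows + dead_rows) * row_bytes
    return live_bytes, on_disk_bytes


live, on_disk = simulate_bloat(live_rows=1_000_000, bulk_update_passes=3)
print(f"live data: {live} bytes, on disk: {on_disk} bytes")
```

In this model, three bulk update passes over one million rows leave three million dead tuples, so the file is four times the size of the live data even though the row count seen by queries never changed.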

Resolution

To remediate this issue, you must verify the disk usage, identify the bloated tables within the pace database, and reclaim the space using a VACUUM FULL operation.

1. First, confirm that the postgresql-ha-postgresql-0 pod is experiencing high disk utilization on its data volume.

Run the following command to check filesystem usage in the postgresql-ha-postgresql-0 pod:

k -n nsxi-platform exec postgresql-ha-postgresql-0 -- df -h

Example response

Filesystem                          Size  Used Avail Use% Mounted on
overlay                              98G  8.5G   85G  10% /
tmpfs                                64M     0   64M   0% /dev
shm                                  64M  284K   64M   1% /dev/shm
tmpfs                               6.2G   12M  6.2G   1% /proc/acpi
/dev/mapper/ssp-var+lib+containers   98G  8.5G   85G  10% /etc/hosts
tmpfs                               6.0G  8.0K  6.0G   1% /certs
/dev/sdb                             49G   36G   12G  76% /bitnami/postgresql      <<-----------------
tmpfs                               6.0G   12K  6.0G   1% /run/secrets/kubernetes.io/serviceaccount

Review the output and check the Use% value for the /bitnami/postgresql mount. If the usage is high (for example, above 75%), continue to the next step.
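
If the check needs to be repeated, the Use% comparison can be scripted. The following is a minimal sketch that parses saved df -h text; the function name mount_usage_percent and the THRESHOLD_PERCENT constant are hypothetical, and the 75% value is only the example threshold mentioned in this step, not an official limit.

```python
from typing import Optional

# Example threshold from this step; adjust as needed (assumption, not an official limit).
THRESHOLD_PERCENT = 75

# Sample rows copied from the example response (marker arrow omitted).
SAMPLE_DF_OUTPUT = """\
Filesystem                          Size  Used Avail Use% Mounted on
overlay                              98G  8.5G   85G  10% /
/dev/sdb                             49G   36G   12G  76% /bitnami/postgresql
"""


def mount_usage_percent(df_output: str, mount_point: str) -> Optional[int]:
    """Return the Use% value for mount_point from `df -h` text, or None if absent."""
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Data rows have six columns: Filesystem Size Used Avail Use% Mounted-on
        if len(fields) >= 6 and fields[5] == mount_point:
            return int(fields[4].rstrip("%"))
    return None


usage = mount_usage_percent(SAMPLE_DF_OUTPUT, "/bitnami/postgresql")
if usage is not None and usage > THRESHOLD_PERCENT:
    print(f"/bitnami/postgresql is at {usage}%: continue to the next step")
```

The sketch matches on the mount point rather than the device name, because the device backing /bitnami/postgresql may differ between deployments.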

2. Next, log in to the postgresql-ha-postgresql-0 pod to confirm that the pace database is the primary consumer and to identify the specific tables causing the bloat.

Run the following query to list the databases by size, largest first:

SELECT datname, pg_size_pretty(pg_database_size(datname))
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

Example response

datname         | pg_size_pretty
-----------------+----------------
pace            | 35 GB     <---------------------
authelialdap    | 9933 kB
authelia        | 9869 kB
alarms          | 9485 kB
clusterapi      | 9181 kB
repmgr          | 8085 kB
upgrade         | 8021 kB
sparkjobmanager | 7925 kB
postgres        | 7909 kB
template0       | 7761 kB
druid           | 7761 kB
template1       | 7761 kB
(12 rows)

3. If the pace database is consuming a disproportionately large amount of space, check which tables are filling it.
Run the following query to list the 10 largest tables in the nsx_config schema, together with the estimated dead tuple size of each:

SELECT
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_indexes_size(relid)) AS indexes_size,
    pg_total_relation_size(relid) AS total_size_bytes,
    n_live_tup AS live_rows,
    n_dead_tup AS dead_rows,
    ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio_pct,
    -- Estimate dead tuple size: (table_size / total_tuples) * dead_tuples
    pg_size_pretty(
        ((pg_relation_size(relid)::numeric / NULLIF(n_live_tup + n_dead_tup, 0)) * n_dead_tup)::bigint
    ) AS estimated_dead_size,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'nsx_config'
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;

The query returns the 10 largest tables in the schema. The estimated_dead_size column shows the estimated dead tuple size for each table.

Example response

schemaname |                 relname                 | total_size | table_size | indexes_size | total_size_bytes | live_rows | dead_rows | dead_ratio_pct | estimated_dead_size |          last_vacuum          |        last_autovacuum        |         last_analyze          |       last_autoanalyze
------------+-----------------------------------------+------------+------------+--------------+------------------+-----------+-----------+----------------+---------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------
nsx_config | normalizeddfwruleconfig                 | 7500 MB    | 6000 MB    | 1500 MB      |       7864320000 |   1000000 |   3000000 |          75.00 | 4500 MB             | 2026-03-15 02:00:00.123456+00 | 2026-03-19 14:22:10.987654+00 | 2026-03-15 02:05:00.123456+00 | 2026-03-19 14:23:10.123456+00
nsx_config | normalizedgroupeffectiveipaddressconfig | 6500 MB    | 5000 MB    | 1500 MB      |       6815744000 |   2000000 |   4000000 |          66.67 | 3333 MB             | 2026-03-10 01:00:00.123456+00 | 2026-03-19 09:11:22.456789+00 | 2026-03-10 01:05:00.123456+00 | 2026-03-19 09:12:00.123456+00
nsx_config | vm_properties_msg                       | 5120 MB    | 4096 MB    | 1024 MB      |       5368709120 |    500000 |   1000000 |          66.67 | 2730 MB             | NULL                          | 2026-03-19 08:45:12.345678+00 | NULL                          | 2026-03-19 08:46:12.345678+00
nsx_config | normalizedcomputeconfig                 | 4096 MB    | 3072 MB    | 1024 MB      |       4294967296 |    300000 |    900000 |          75.00 | 2304 MB             | 2026-03-01 03:00:00.123456+00 | 2026-03-19 22:10:05.123456+00 | 2026-03-01 03:05:00.123456+00 | 2026-03-19 22:11:00.123456+00
nsx_config | virtual_machine_container_msg           | 3072 MB    | 2560 MB    | 512 MB       |       3221225472 |    400000 |    600000 |          60.00 | 1536 MB             | NULL                          | 2026-03-19 18:30:45.123456+00 | NULL                          | 2026-03-19 18:31:00.123456+00
nsx_config | service_entry_l4_port_set_service_entry | 2048 MB    | 1536 MB    | 512 MB       |       2147483648 |   1000000 |   1000000 |          50.00 | 768 MB              | 2026-02-15 04:00:00.123456+00 | 2026-03-19 11:22:33.123456+00 | 2026-02-15 04:05:00.123456+00 | 2026-03-19 11:23:00.123456+00
nsx_config | service_entry                           | 1536 MB    | 1024 MB    | 512 MB       |       1610612736 |    500000 |    500000 |          50.00 | 512 MB              | NULL                          | 2026-03-19 05:15:20.123456+00 | NULL                          | 2026-03-19 05:16:20.123456+00
nsx_config | vni_msg                                 | 1024 MB    | 800 MB     | 224 MB       |       1073741824 |    100000 |    300000 |          75.00 | 600 MB              | 2026-03-10 05:00:00.123456+00 | 2026-03-19 01:10:15.123456+00 | 2026-03-10 05:05:00.123456+00 | 2026-03-19 01:11:15.123456+00
nsx_config | compute_collection                      | 512 MB     | 400 MB     | 112 MB       |        536870912 |     50000 |    100000 |          66.67 | 266 MB              | NULL                          | 2026-03-19 20:45:30.123456+00 | NULL                          | 2026-03-19 20:46:00.123456+00
nsx_config | service                                 | 256 MB     | 200 MB     | 56 MB        |        268435456 |     10000 |     40000 |          80.00 | 160 MB              | 2026-03-01 06:00:00.123456+00 | 2026-03-19 04:20:10.123456+00 | 2026-03-01 06:05:00.123456+00 | 2026-03-19 04:21:10.123456+00
(10 rows)

Note the tables with the highest estimated_dead_size and dead_ratio_pct values. These tables are the targets for remediation.
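
To make the two derived columns easier to interpret, the arithmetic behind dead_ratio_pct and estimated_dead_size can be reproduced outside the database. This is a minimal sketch; the function name dead_tuple_estimate is hypothetical, and it uses integer floor division, so its result can differ by a byte from the SQL expression, which rounds when casting to bigint. The estimate assumes live and dead tuples have the same average size.

```python
MB = 1024 * 1024


def dead_tuple_estimate(table_size_bytes, live_rows, dead_rows):
    """Return (dead_ratio_pct, estimated_dead_bytes), or (None, None) for an empty table."""
    total = live_rows + dead_rows
    if total == 0:
        # Mirrors NULLIF(n_live_tup + n_dead_tup, 0), which yields NULL in the query.
        return None, None
    dead_ratio_pct = round(100.0 * dead_rows / total, 2)
    # (table_size / total_tuples) * dead_tuples, computed with exact integer arithmetic.
    estimated_dead_bytes = table_size_bytes * dead_rows // total
    return dead_ratio_pct, estimated_dead_bytes


# Values from the first row of the example response: 6000 MB table, 1000000 live, 3000000 dead.
ratio, dead_bytes = dead_tuple_estimate(6000 * MB, 1_000_000, 3_000_000)
print(f"dead_ratio_pct={ratio}, estimated_dead_size={dead_bytes // MB} MB")
```

For the first example row, three of every four tuples are dead, so roughly three quarters of the 6000 MB table file (4500 MB) is attributed to dead tuples.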

Contact Broadcom Technical Support to run the VACUUM FULL command under supervision.
