/storage/seat disk 100% full on vCenter Server Appliance 6.x/7.x/8.x

Article ID: 318931

Products

VMware vCenter Server

Issue/Introduction

The vCenter Server Appliance services fail to start.
When connecting to the VCSA using the vSphere Web Client, there may be an error similar to:
503 Service Unavailable
When connecting to the vSphere UI, the inventory is empty
Running df -h on the vCenter Server Appliance shows the /dev/mapper/seat_vg-seat mounted on /storage/seat as 95% or more full.

`Filesystem`	`Size`	`Used`	`Avail`	`Use%`	`Mounted on`
`/dev/sda3`	`11G`	`3.9G`	`6.4G`	`38%`	`/`
`udev`	`4.0G`	`164K`	`4.0G`	`1%`	`/dev`
`tmpfs`	`4.0G`	`40K`	`4.0G`	`1%`	`/dev/shm`
`/dev/sda1`	`128M`	`38M`	`84M`	`31%`	`/boot`
`/dev/mapper/core_vg-core`	`25G`	`2.7G`	`21G`	`12%`	`/storage/core`
`/dev/mapper/log_vg-log`	`9.9G`	`2.5G`	`7.0G`	`26%`	`/storage/log`
`/dev/mapper/db_vg-db`	`9.9G`	`214M`	`9.2G`	`3%`	`/storage/db`
`/dev/mapper/dblog_vg-dblog`	`5.0G`	`379M`	`4.3G`	`8%`	`/storage/dblog`
`/dev/mapper/seat_vg-seat`	`9.9G`	`9.4G`		`100%`	`/storage/seat`
`/dev/mapper/netdump_vg-netdump`	`1001M`	`18M`	`932M`	`2%`	`/storage/netdump`
`/dev/mapper/autodeploy_vg-autodeploy`	`9.9G`	`151M`	`9.2G`	`2%`	`/storage/autodeploy`
`/dev/mapper/invsvc_vg-invsvc`	`5.0G`	`191M`	`4.5G`	`4%`	`/storage/invsvc`
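
The check described above can be scripted. The following is a minimal sketch, not part of the appliance tooling: `usage_pct` and `at_or_above_threshold` are hypothetical helper names, and `df -P` is used instead of `df -h` only because its one-line-per-filesystem output is easier to parse.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the
# Use% number (without the % sign) for the given mount point.
usage_pct() {
  awk -v mp="$1" '$NF == mp { sub(/%$/, "", $(NF-1)); print $(NF-1) }'
}

# Hypothetical helper: succeed when the value is at or above 95,
# the usage level at which this article says the problem appears.
at_or_above_threshold() {
  [ "${1:-0}" -ge 95 ]
}

# Query the live filesystem only when the mount point exists.
if [ -d /storage/seat ]; then
  pct=$(df -P /storage/seat | usage_pct /storage/seat)
  if at_or_above_threshold "$pct"; then
    echo "/storage/seat is ${pct}% full: at or above the 95% mark"
  else
    echo "/storage/seat usage: ${pct:-unknown}%"
  fi
fi
```

On a system without /storage/seat the script prints nothing; the helpers can still be fed saved `df -P` output through a pipe.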

In the /var/log/vmware/vpxd/vpxd.log file, see entries similar to:

<YYYY-MM-DD>T<TIME> warning vpxd[7F447FF7E700] [Originator@6876 sub=Default opID=StatsTruncateExpiredPartitions-7460ee5a]
[VdbStatement] Connection diagnostic data from driver is HY000:0:110:
<YYYY-MM-DD>T<TIME> error vpxd[7F44865F3700] [Originator@6876 sub=Default] [VdbStatement] Execute result code: -1
<YYYY-MM-DD>T<TIME> warning vpxd[7F44865F3700] [Originator@6876 sub=Default] [VdbStatement] SQL execution failed: INSERT INTO
VPX_EVENT (EVENT_ID, CHAIN_ID, EVENT_TYPE, EXTENDED_CLASS, CREATE_TIME, USERNAME, CATEGORY, VM_ID, VM_NAME, HOST_ID, HOST_NAME,
COMPUTERESOURCE_ID, COMPUTERESOURCE_TYPE, COMPUTERESOURCE_NAME, DATACENTER_ID, DATACENTER_NAME, DATASTORE_ID, DATASTORE_NAME,
NETWORK_ID, NETWORK_NAME, NETWORK_TYPE, DVS_ID, DVS_NAME, STORAGEPOD_ID, STORAGEPOD_NAME, CHANGE_TAG_ID) VALUES (?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
<YYYY-MM-DD>T<TIME> warning vpxd[7F44865F3700] [Originator@6876 sub=Default] [VdbStatement] Execution elapsed time: 3 ms
<YYYY-MM-DD>T<TIME> warning vpxd[7F44865F3700] [Originator@6876 sub=Default] [VdbStatement] Statement diagnostic data from
driver is 53100:0:7:ERROR: could not extend file "pg_tblspc/16396/PG_9.3_201306121/16384/16641": No space left on device;
--> Error while executing the query

In the /var/log/vmware/vpxd/vpxd.log file on vCenter Server Appliance 6.7.x, entries are similar to:

yyyy-mm-ddThh:mm:ss info vpxd[24517] [Originator@6876 sub=vpxdVdb] WarningThreshold: 80% ErrorThreshold: 95%.
yyyy-mm-ddThh:mm:ss error vpxd[24517] [Originator@6876 sub=vpxdVdb] Insufficient free space for the Database (used: 96%; threshold: 95%)
yyyy-mm-ddThh:mm:ss info vpxd[07808] [Originator@6876 sub=vpxdvpxdSignal] Signal 15 received, exiting
yyyy-mm-ddThh:mm:ss info vpxd[07808] [Originator@6876 sub=Default] Initiating VMware VirtualCenter shutdown
yyyy-mm-ddThh:mm:ss info vpxd[07711] [Originator@6876 sub=Default] Shutting down VMware VirtualCenter

In the /var/log/vmware/vpxd/vpxd.log file on vCenter Server 7.x / 8.x, entries are similar to:

yyyy-mm-ddThh:mm:ss info vpxd[18939] [Originator@6876 sub=vpxdVdb] WarningThreshold: 80% ErrorThreshold: 95%.
yyyy-mm-ddThh:mm:ss error vpxd[18939] [Originator@6876 sub=vpxdVdb] Space used on storage partition of Database exceeds
threshold (used: 95%; threshold: 95%). Service-control request will stop vpxd
yyyy-mm-ddThh:mm:ss info vpxd[18917] [Originator@6876 sub=vpxdvpxdSignal] Signal 15 received, exiting
yyyy-mm-ddThh:mm:ss info vpxd[18917] [Originator@6876 sub=Default] Initiating VMware VirtualCenter shutdown
yyyy-mm-ddThh:mm:ss info vpxd[18442] [Originator@6876 sub=Default] Shutting down VMware VirtualCenter

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on the environment.
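
To check a log file for the messages quoted above, a search along the following lines may help. This is a sketch only: `find_seat_full_entries` is a hypothetical helper name, the patterns are copied from the excerpts in this article, and the exact wording may differ between builds.

```shell
# Hypothetical helper: print the lines of the given log file that match
# the out-of-space messages quoted in this article.
find_seat_full_entries() {
  grep -E 'No space left on device|Insufficient free space for the Database|Space used on storage partition of Database exceeds' "$1"
}

# Log path as given in this article; searched only if the file is readable.
log=/var/log/vmware/vpxd/vpxd.log
if [ -r "$log" ]; then
  # Show the five most recent matching entries.
  find_seat_full_entries "$log" | tail -n 5
fi
```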

Environment

VMware vCenter Server Appliance 6.x
VMware vCenter Server Appliance 7.x

VMware vCenter Server Appliance 8.x

Cause

This issue occurs when a large number of events collected on the vCenter Server Appliance fills the database. By default, when the partition is 95% or more full, the critical vmware-vpxd service is not allowed to run, which prevents the database from becoming corrupted.

The most frequent cause is the condition described in Excessive Hardware health alarms being triggered for "Sensor -1 type" on ESXi hosts running vSphere 6.7/6.5.

Resolution

To resolve this issue, first determine which ESXi host is filling the events table, then truncate the events tables.

To find the ESXi host that is generating the events:
1. Take a backup of the VCDB. See File-Based Backup and Restore of vCenter Server.
2. Take a snapshot of the vCenter Server Appliance.
3. Connect to vCenter Server Appliance through the console or using an SSH session and root credentials.
4. Enable the shell by running this command:
```
shell.set --enabled true
```
5. Type shell and press Enter.
6. Stop the vpxd service and content library service by running:
```
service-control --stop vmware-vpxd && service-control --stop vmware-content-library
```
7. Run this command to log in to the vCenter Server Appliance database:
```
/opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres
```
8. Run this query to determine the source of the events:
  
  vCenter Server Appliance 6.0
```
SELECT COUNT(EVENT_ID) AS NUMEVENTS, EVENT_TYPE, USERNAME FROM VPX_EVENT GROUP BY EVENT_TYPE, USERNAME ORDER BY NUMEVENTS DESC LIMIT 10;
```
  vCenter Server Appliance 6.x/7.x/8.x
```
SELECT COUNT(EVENT_ID) AS NUMEVENTS, EVENT_TYPE, USERNAME FROM VPXV_EVENT_ALL GROUP BY EVENT_TYPE, USERNAME ORDER BY NUMEVENTS DESC LIMIT 10;
```
  Sample output
```
vmfs.heartbeat.timedout |<esxi_ip> | 12191576
vim.event.UserLogoutSessionEvent |<esxi_ip> | 1219121
vim.event.VmAcquiredTicketEvent |<esxi_ip> | 15568
```
9. To reclaim space, type \q to exit the vCenter Server Appliance database, then run the following command:
```
su -c "/opt/vmware/vpostgres/current/bin/vacuumdb -d VCDB -e -v -f -U postgres > /tmp/vacuumdb.log"
```
  - Note: If you are logged in as the root user, "su -c" is not required.
10. As the sample output in step 8 shows, about 12191576 events related to vmfs.heartbeat.timedout were generated by the ESXi host with the IP <esxi_ip>.
11. Investigate this host further to resolve the issue it is reporting.
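
The query in step 8 can also be run without an interactive database session. The sketch below assumes the psql path and database name from step 7; `top_events_query` is a hypothetical helper that only builds the SQL text, and `-c` is the standard psql option for running a single command.

```shell
# Hypothetical helper: build the top-N event query from step 8.
# $1 = table or view name (VPX_EVENT or VPXV_EVENT_ALL), $2 = row limit.
top_events_query() {
  printf 'SELECT COUNT(EVENT_ID) AS NUMEVENTS, EVENT_TYPE, USERNAME FROM %s GROUP BY EVENT_TYPE, USERNAME ORDER BY NUMEVENTS DESC LIMIT %d;\n' "$1" "$2"
}

psql_bin=/opt/vmware/vpostgres/current/bin/psql
# Run the query only where the appliance's psql binary is present.
if [ -x "$psql_bin" ]; then
  "$psql_bin" -d VCDB -U postgres -c "$(top_events_query VPXV_EVENT_ALL 10)"
fi
```

Keeping the SQL in a function makes it easy to switch between the 6.0 table and the later view, or to raise the row limit, without retyping the statement.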
To clean up the events in the vCenter Server Appliance database, take an offline snapshot of every VCSA VM in linked mode (all vCenter VMs must be offline at the same time while the snapshots are taken), then follow the steps below.

Warning: The TRUNCATE command permanently deletes all data from a table. If you are unsure whether a specific table contains essential configuration data, contact Broadcom Support for verification before proceeding.
1. From the console or SSH session connected to the vCenter Server Appliance, log in to the database again (as in step 7 of the previous procedure), then run these commands to truncate the event table data:
  
  vCenter Server Appliance 6.0:
```
truncate table vpx_event cascade;
```
```
truncate table vpx_event_arg cascade;
```
  vCenter Server Appliance 6.x/7.x/8.x:
```
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND C.relkind <> 'i'
  AND nspname !~ '^pg_toast'
  AND (relname ~ '^vpx_event_' OR relname ~ '^vpx_event_arg_' OR relname ~ '^vpx_task')
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 20;
```
  - This displays the 20 largest tables in the vCenter Server database whose names start with "vpx_event_", "vpx_event_arg_" or "vpx_task".
  - Usually, truncating only the event/task table data reduces usage on /storage/seat to a reasonable level.
  - Truncate large tables individually, one at a time.
    Example command:
```
VCDB=# truncate table vc.vpx_event_1 cascade;
```
    Note: The solution in this KB truncates only the vc.vpx_event_x tables.
2. Exit the vCenter Server Appliance database by running this command:
```
\q
```
3. Start the vpxd and content library services by running this command:
```
service-control --start vmware-vpxd && service-control --start vmware-content-library
```
4. Verify that the space is reclaimed by running the df -h command. Usage of /storage/seat should now be below the 95% threshold.
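
When several tables need to be truncated one at a time, the statements can be generated from a list of table names rather than typed by hand. This is a minimal sketch under stated assumptions: `gen_truncate_statements` is a hypothetical helper, it expects bare relation names (one per line, no psql column padding), and it assumes the "x" in vc.vpx_event_x is a numeric suffix as in the vc.vpx_event_1 example. It only prints statements and executes nothing.

```shell
# Hypothetical helper: read relation names on stdin and print a TRUNCATE
# statement for each name of the form vc.vpx_event_<number>.
# Other names (for example vc.vpx_event_arg_1 or vc.vpx_task) are skipped,
# matching the note that this KB truncates only vc.vpx_event_x tables.
gen_truncate_statements() {
  awk '/^vc\.vpx_event_[0-9]+$/ { printf "truncate table %s cascade;\n", $0 }'
}

# Example with made-up input; only the first name passes the filter.
printf '%s\n' vc.vpx_event_1 vc.vpx_event_arg_1 vc.vpx_task | gen_truncate_statements
```

Review the generated statements before pasting them into the database session; the filter is deliberately strict so that a stray table name is dropped rather than truncated.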

Additional Information

VMware Skyline Health Diagnostics for vSphere - FAQ

Excessive Hardware health alarms being triggered for "Sensor -1 type" on ESXi hosts running vSphere 6.7/6.5
Increasing the disk space for the vCenter Server Appliance in vSphere 6.5, 6.7, 7.0 and 8.0

Japanese KB: vCenter Server Appliance 6.x/7.x/8.xで /storage/seat のディスク使用率が 100% になる
