vCenter Server becomes unresponsive and/or Postgres database runs out of disk space
search cancel

vCenter Server becomes unresponsive and/or Postgres database runs out of disk space

book

Article ID: 316343

calendar_today

Updated On:

Products

VMware vCenter Server

Issue/Introduction

In certain environments where the vCenter is very busy, or where a customer may be making many calls that result in queries to the vCenter Server database, the vCenter Server may experience performance issues and/or become unresponsive and require a reboot.

You may also observe the Postgres autovacuum process appearing to be stuck and/or queries taking excessively long periods of time to process.

You may observe /storage/db running out of space.

This has been seen in vCenter Server 8.0, though it may also affect 7.0.

Environment

vCenter Server 8.x
vCenter Server 7.x

Cause

There are many reasons why you may run into symptoms similar to these, and you should fully investigate the issue.  Review other KB's and make sure there aren't other issues that need to be addressed also.

This particular issue is caused by an inadequate setting of the autovacuum_vacuum_cost_limit setting in Postgres, either due to its own setting or by inheriting the value from the vacuum_cost_limit setting.

To check these settings, run the following commands:

show autovacuum_vacuum_cost_limit;
show vacuum_cost_limit;

If the output shows that the value is too low, you will see similar output as below:
 
VCDB=# show autovacuum_vacuum_cost_limit;
autovacuum_vacuum_cost_limit
------------------------------
-1
(1 row)
 
VCDB=# show vacuum_cost_limit;
vacuum_cost_limit
-------------------
200
(1 row)

Resolution

The value of vacuum_cost_limit should be set to 10000.  A performance review of the current default setting of 200 and found it to be inadequate.  The default value has been updated for future releases.

In order adjust the setting, run the following commands:
 
/opt/vmware/vpostgres/current/bin/psql -U postgres -c "ALTER SYSTEM SET vacuum_cost_limit = 10000;"
/opt/vmware/vpostgres/current/bin/psql -U postgres -c "SELECT pg_reload_conf()"

Once this has been done, monitor to see if the issue is resolved.

In case of larger vCenter Server systems, or systems where there is a large amount of database activity, it can also help to shift some of the vacuum operations to off-peak hours by utilizing a cron job to run the following command:
 
/opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB -c "vacuum(freeze, analyze);"

This should help alleviate some of the work from being done during peak usage times, though it is not necessary in all cases.