vpxd service fails to start due to Postgres corruption

Article ID: 408153

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • The issue occurs after a storage issue on the datastore where the vCenter VM resides.
  • Upon rebooting the vCenter, you might encounter errors indicating file system corruption.
  • After implementing the steps provided in How to Fix Corruption on the vCenter Server Appliance, the vCenter VM boots up successfully; however, the vpxd service fails to start.
  • Attempts to manually start the vpxd service also fail.
  • In /var/log/vmware/vpostgres/postgresql.log, you see messages similar to:
    [YYYY-MM-DDTHH:MM:SS] 689cf970.452c 0 VCDB vc [local] 17708 3 ERROR:  could not open file "pg_tblspc/16394/PG_13_########/#####/########": No such file or directory
    [YYYY-MM-DDTHH:MM:SS] 689cf970.452c 0 VCDB vc [local] 17708 4 STATEMENT:  BEGIN;declare "SQL_CUR0x7f45d400bac0" cursor with hold for  SELECT EVENT_ID, CHAIN_ID, EVENT_TYPE, EXTENDED_CLASS, CREATE_TIME, USERNAME, CATEGORY, VM_ID,        VM_NAME, HOST_ID, HOST_NAME, COMPUTERESOURCE_ID, COMPUTERESOURCE_TYPE,        COMPUTERESOURCE_NAME, DATACENTER_ID, DATACENTER_NAME, DATASTORE_ID, DATASTORE_NAME, NETWORK_ID,        NETWORK_NAME, NETWORK_TYPE, DVS_ID, DVS_NAME, STORAGEPOD_ID, STORAGEPOD_NAME, CHANGE_TAG_ID FROM VPXV_EVENT_ALL  WHERE (EVENT_ID IN (########,########,########,########,########,########,########,########)) LIMIT 8; ;fetch 1024 in "SQL_CUR0x7f45d400bac0"
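
To see how many distinct relation files PostgreSQL reports as missing, the log can be filtered for the error shown above. The following is a minimal sketch: the function name missing_relation_files is hypothetical, and the default path is the log location given in this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of vCenter): list the distinct files that
# PostgreSQL reported as missing in the vPostgres log.
missing_relation_files() {
    # Default to the log path named in the article; a copy can be passed instead.
    log_file="${1:-/var/log/vmware/vpostgres/postgresql.log}"
    # Keep only the "could not open file" errors, extract the quoted path,
    # and de-duplicate the result.
    grep 'could not open file' "$log_file" \
        | sed -n 's/.*could not open file "\([^"]*\)".*/\1/p' \
        | sort -u
}
```

Calling missing_relation_files with no argument reads the live log; passing a path reads a saved copy. An empty result means no such errors were logged in that file.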

Cause

Such corruption often arises from abrupt storage interruptions or improper shutdowns, which lead to file system inconsistencies and, in some cases, compromise database integrity as well.

Resolution

Follow the steps below to fix the corruption:

  • Stop the vCenter services:
    service-control --stop --all

  • Start the Postgres service:
    service-control --start vmware-vpostgres

  • Run the commands below (VACUUM FULL generally fixes mild corruption):
    /opt/vmware/vpostgres/current/bin/vacuumdb -U postgres -d VCDB --full --analyze
    /opt/vmware/vpostgres/current/bin/reindexdb -U postgres -d VCDB

    The vacuum step can also be issued as SQL from a psql session connected to VCDB:
    vacuum full;
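
The vacuumdb and reindexdb commands above can be wrapped so that reindexing is skipped if the vacuum fails. This is a sketch under stated assumptions rather than a VMware-provided script: the function name run_vcdb_maintenance and the VPG_BIN override are hypothetical, and the default binary path is the one used in this article.

```shell
#!/bin/sh
# Sketch: run the article's two maintenance commands in order and stop at
# the first failure. VPG_BIN is a hypothetical override, useful for dry runs.
run_vcdb_maintenance() {
    bin_dir="${VPG_BIN:-/opt/vmware/vpostgres/current/bin}"
    db="${1:-VCDB}"

    echo "Running vacuumdb --full --analyze on $db"
    "$bin_dir/vacuumdb" -U postgres -d "$db" --full --analyze || {
        echo "vacuumdb failed; skipping reindexdb" >&2
        return 1
    }

    echo "Running reindexdb on $db"
    "$bin_dir/reindexdb" -U postgres -d "$db" || {
        echo "reindexdb failed" >&2
        return 2
    }

    echo "maintenance completed on $db"
}
```

Run it only after the other vCenter services are stopped and vmware-vpostgres is started, as described in the preceding steps. A non-zero return code indicates which command failed (1 for vacuumdb, 2 for reindexdb).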

  • Run a DB check to identify whether PostgreSQL still has any issues.
  • If VACUUM FULL does not resolve the corruption, restore the environment from a valid backup.
  • If the restore does not resolve the corruption, redeploy vCenter, as the environment is beyond repair and database integrity is compromised.
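
The article does not name a specific tool for the DB check step. One generic PostgreSQL technique, shown here only as a sketch, is a read-through with pg_dump: it has to read the contents of every table, so a missing relation file causes it to fail (indexes are not verified this way). The function name check_vcdb_readable and the VPG_BIN override are hypothetical, and the sketch assumes pg_dump is installed in the same directory as vacuumdb.

```shell
#!/bin/sh
# Sketch of a generic read-through check, not the article's own tool.
# Assumption: pg_dump lives alongside vacuumdb; VPG_BIN is a hypothetical
# override for dry runs.
check_vcdb_readable() {
    bin_dir="${VPG_BIN:-/opt/vmware/vpostgres/current/bin}"
    db="${1:-VCDB}"
    # Discard the dump itself; only the exit status matters here.
    if "$bin_dir/pg_dump" -U postgres -d "$db" -f /dev/null; then
        echo "OK: every table in $db could be read"
    else
        echo "FAIL: pg_dump could not read $db; check postgresql.log" >&2
        return 1
    fi
}
```

A failure here suggests the corruption persists and that the restore step above is the next option.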