vpxd service fails to start due to Postgres corruption

Article ID: 408153

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • The issue occurs after a storage problem on the datastore where the vCenter VM resides.
  • Upon rebooting the vCenter, you might encounter errors indicating file system corruption.
  • After implementing the steps provided in KB 385536, the vCenter VM boots up successfully; however, the vpxd service fails to start.
  • Attempts to manually start the vpxd service also fail. 
  • In /var/log/vmware/vpostgres/postgresql.log, you see messages similar to:
    [YYYY-MM-DDTHH:MM:SS] 689cf970.452c 0 VCDB vc [local] 17708 3 ERROR:  could not open file "pg_tblspc/16394/PG_13_########/#####/########": No such file or directory
    [YYYY-MM-DDTHH:MM:SS] 689cf970.452c 0 VCDB vc [local] 17708 4 STATEMENT:  BEGIN;declare "SQL_CUR0x7f45d400bac0" cursor with hold for  SELECT EVENT_ID, CHAIN_ID, EVENT_TYPE, EXTENDED_CLASS, CREATE_TIME, USERNAME, CATEGORY, VM_ID,        VM_NAME, HOST_ID, HOST_NAME, COMPUTERESOURCE_ID, COMPUTERESOURCE_TYPE,        COMPUTERESOURCE_NAME, DATACENTER_ID, DATACENTER_NAME, DATASTORE_ID, DATASTORE_NAME, NETWORK_ID,        NETWORK_NAME, NETWORK_TYPE, DVS_ID, DVS_NAME, STORAGEPOD_ID, STORAGEPOD_NAME, CHANGE_TAG_ID FROM VPXV_EVENT_ALL  WHERE (EVENT_ID IN (########,########,########,########,########,########,########,########)) LIMIT 8; ;fetch 1024 in "SQL_CUR0x7f45d400bac0"
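
To confirm that the log contains this signature, a search along the following lines may help. This is a minimal sketch: the function name count_missing_relation_errors is hypothetical and not part of any VMware tooling, and the default path is the log location named above.

```shell
#!/bin/bash
# Count postgresql.log lines that report a missing relation file under pg_tblspc.
# count_missing_relation_errors is an illustrative helper, not a VMware command.
count_missing_relation_errors() {
    local log="${1:-/var/log/vmware/vpostgres/postgresql.log}"
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the helper usable under "set -e".
    grep -E -c 'ERROR: +could not open file "pg_tblspc/' "$log" || true
}

# Usage on the appliance (prints the number of matching lines):
#   count_missing_relation_errors
```

A non-zero count suggests that at least one query hit a relation file that no longer exists on disk, which matches the symptom described above.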

Cause

Such corruption often arises from abrupt storage interruptions or improper shutdowns, which leave the file system inconsistent and, in some cases, compromise database integrity as well.

Resolution

To fix the corruption issue, please implement the following steps:

  • Stop the vCenter services:
    service-control --stop --all

  • Start the Postgres service:
    service-control --start vmware-vpostgres

  • Run the commands below (VACUUM FULL generally fixes mild corruption):
    /opt/vmware/vpostgres/current/bin/vacuumdb -U postgres -d VCDB --full --analyze
    /opt/vmware/vpostgres/current/bin/reindexdb -U postgres -d VCDB

  • Run a database check to confirm whether PostgreSQL still reports any issues.
  • If VACUUM FULL does not resolve the corruption, the environment must be restored. Identify a known-good backup and restore it.

  • If the restore does not resolve the corruption, redeploy the vCenter, because the environment is beyond repair and database integrity is compromised.
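
The repair commands above can be chained so that each step runs only if the previous one succeeded. The following is a rough sketch rather than a supported script: the helper names run, repair_vcdb, and check_vcdb_readable and the DRY_RUN variable are hypothetical. The pg_dump read check is an assumption, not a step from this article. It presumes that pg_dump is available in the same bin directory and relies on the fact that dumping to /dev/null forces PostgreSQL to read every table, so an unreadable relation file would surface as an error.

```shell
#!/bin/bash
# Sketch: run the documented repair steps in order, stopping at the first failure.
PGBIN=/opt/vmware/vpostgres/current/bin

run() {
    # With DRY_RUN=1 the command is only printed; otherwise it is executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

repair_vcdb() {
    run service-control --stop --all &&
    run service-control --start vmware-vpostgres &&
    run "$PGBIN/vacuumdb" -U postgres -d VCDB --full --analyze &&
    run "$PGBIN/reindexdb" -U postgres -d VCDB
}

check_vcdb_readable() {
    # Assumption: pg_dump exists under $PGBIN. The dump itself is discarded;
    # only the exit status and any error output are of interest.
    run "$PGBIN/pg_dump" -U postgres -d VCDB -f /dev/null
}

# Preview the commands without executing them:
#   DRY_RUN=1; repair_vcdb; check_vcdb_readable
```

If check_vcdb_readable exits non-zero after the repair, that would suggest the corruption persists and that the restore step applies.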