VPXD service may stop or fail to start due to database corruption.

Products

VMware vCenter Server

Issue/Introduction

The vmware-vpxd (VMware vCenter Server) service fails to start.
The VPXD service crashes repeatedly and generates core.vpxd files under /var/core/.
The /var/log/vmware/vpxd/vpxd.log file contains error messages similar to the following:

YYYY-MM-DD HH:MM:SS:MS error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] Unable to read from '/etc/motd':N7Vmacore23FileIONotFoundExceptionE(FileIO error: Could not find file  : /etc/motd)
YYYY-MM-DD HH:MM:SS:MS  error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] [OptionMgr] Ignoring unknown entry from DB: alarms.upgraded
YYYY-MM-DD HH:MM:SS:MS error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] [OptionMgr] Ignoring unknown entry from DB: VirtualCenter.LDAPAdminPrincipal
YYYY-MM-DD HH:MM:SS:MS  error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] [OptionMgr] Ignoring unknown entry from DB: VirtualCenter.VimWebServicesUrl
YYYY-MM-DD HH:MM:SS:MS  error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] [OptionMgr] Ignoring unknown entry from DB: vpxd.hostkey.newIfMissing
YYYY-MM-DD HH:MM:SS:MS  error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] [OptionMgr] Ignoring unknown entry from DB: vpxd.kmscert.threshold
YYYY-MM-DD HH:MM:SS:MS  error vpxd[7F1B2EDB8800] [Originator@6876 sub=MoOptionMgr] [OptionMgr] Skipping bad entry config.vpxd.enableDebugBrowse from DB. Resetting to default.Exception: vmodl.fault.InvalidArgument

Environment

VMware vCenter Server Appliance 6.5.x
VMware vCenter Server Appliance 6.7.x
VMware vCenter Server Appliance 7.0.x

Cause

The VPXD service fails to start because the vCenter Server database contains a corrupted entity. The corrupted entity must be identified and every reference to it removed from the database before the VPXD service can start.

Resolution

To resolve this issue, restore vCenter Server or its database from a recent backup.

OR

Workaround:

1. Ensure the vPostgres Service is Running

Run the following command and confirm that the vmware-vpostgres service is reported as running:

# service-control --status --all

2. Check for Disk Space Issues

Ensure that there is sufficient disk space on the vCenter Server Appliance:

# df -h

If a partition is full, it can prevent vCenter services from starting or running correctly.
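The disk space check can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name flag_full_filesystems and the 90 percent threshold are illustrative assumptions. It reads POSIX-format `df -P` output and prints every mount point whose usage is at or above the given percentage.

```shell
#!/bin/sh
# flag_full_filesystems THRESHOLD   (hypothetical helper name)
# Reads `df -P` style output on stdin and prints "<mount point> <use%>"
# for every filesystem whose usage is at or above THRESHOLD percent.
# Mount points containing spaces are not handled by this sketch.
flag_full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the df header line
            use = $5                  # fifth column is the capacity, e.g. "95%"
            sub(/%/, "", use)         # strip the percent sign
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Typical use on the appliance (the threshold of 90 is an arbitrary choice):
#   df -P | flag_full_filesystems 90
```

An empty result means no filesystem has reached the threshold. Any partition that is listed should be investigated before continuing with the database checks.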

3. Validate the Consistency of vPostgres Database

To check the consistency of the vPostgres database, please follow the steps outlined in the KB article: Validate the Consistency of the vCenter Server Appliance 6.5 and Newer Databases (53062).

To Find Corrupted Entities:

Use the following SQL query to identify any corrupted entries in the vpx_vm table:

select id from vpx_vm;

This returns a list of ID numbers. Copy the list to a text editor and build the following SQL query for each ID:

select * from vpx_vm where id = <id_num>;

Run these SQL commands. Any query that returns an error indicates a corrupted row in the database, for example:

ERROR: missing chunk number 0 for toast value 2528290 in pg_toast_19544

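Building and running one query per ID by hand becomes tedious on a large inventory. The following is a minimal sketch of how the per-ID check could be scripted. It rests on several assumptions that should be verified on the appliance first: the psql client path /opt/vmware/vpostgres/current/bin/psql, the postgres database user, and the VCDB database name. The function names build_row_queries and find_corrupt_ids are hypothetical. The script only reads from the database and does not delete anything.

```shell
#!/bin/sh
# Assumed psql location and database name; override via the environment if
# they differ on your appliance.
PSQL="${PSQL:-/opt/vmware/vpostgres/current/bin/psql}"
DB="${DB:-VCDB}"

# build_row_queries (hypothetical helper): read one ID per line on stdin and
# print one SELECT statement per ID. Blank and non-numeric lines are skipped.
build_row_queries() {
    while IFS= read -r id; do
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        printf 'select * from vpx_vm where id = %s;\n' "$id"
    done
}

# find_corrupt_ids (hypothetical helper): fetch every ID from vpx_vm, run the
# per-ID query, and print the IDs whose query exits with an error.
find_corrupt_ids() {
    "$PSQL" -U postgres -d "$DB" -t -A -c 'select id from vpx_vm;' |
    while IFS= read -r id; do
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        # stdin is redirected so psql cannot consume the list of IDs.
        if ! "$PSQL" -U postgres -d "$DB" -v ON_ERROR_STOP=1 \
                -c "select * from vpx_vm where id = $id;" \
                >/dev/null 2>&1 </dev/null; then
            echo "$id"
        fi
    done
}

# Typical use on the appliance:
#   find_corrupt_ids
# Or, to only generate the statements from a saved list of IDs:
#   build_row_queries < ids.txt > check_rows.sql
```

Each ID printed by find_corrupt_ids should be rerun manually to see the exact error message before any row is removed.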
Once you’ve identified the corrupted entries, remove them from the database using the following SQL commands:

delete from vpx_vm where id = <id_num_found_corruption>;
delete from vpx_entity where name = <id_corresponding_name>;

For more detailed steps on removing stale entries from the vCenter database, please refer to the KB: Manually removing a stale VM from the vCSA DB (2148073).

4. Restart vCenter Services

After removing the corrupted entities from the database, stop all vCenter services and then start them again:

# service-control --stop --all
# service-control --start --all

If the corrupted entries were the cause of the failure, the VPXD service should now start and the vCenter services should run as expected.