Issue: Reverting to the Most Recent Checkpoint to Restore vCenter Functionality
Symptoms:
When attempting to start the vpxd service, the following error appears in the vpxd.log:
Failed to connect to database: ODBC error: (08001) - [unixODBC] Could not connect to the server; --> Connection refused [XXX.X.X.XX:XXXX]
Additionally, the PostgreSQL logs indicate the following error:
PANIC: could not locate a valid checkpoint record.
These errors indicate that PostgreSQL cannot start because its checkpoint record is missing or corrupt, so vpxd cannot connect to the vCenter Server database. Resetting the PostgreSQL transaction log, as described in the steps below, lets the database start again from the most recent valid checkpoint and restores vCenter functionality.
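To confirm the symptom before changing anything, the PostgreSQL logs can be searched for the PANIC message. The following is a minimal sketch, not an official procedure: the log directory /var/log/vmware/vpostgres and the postgresql-*.log file name pattern are assumptions about a default appliance layout, and has_checkpoint_panic is a hypothetical helper name. It only reads the logs.

```shell
#!/bin/sh
# Return 0 if the given log file contains the checkpoint PANIC message.
# (has_checkpoint_panic is a hypothetical helper, not a VMware tool.)
has_checkpoint_panic() {
    grep -q "PANIC:.*could not locate a valid checkpoint record" "$1"
}

# Assumed default log location; override PG_LOG_DIR if yours differs.
PG_LOG_DIR="${PG_LOG_DIR:-/var/log/vmware/vpostgres}"

# Report every log file that contains the message; read-only.
for log in "$PG_LOG_DIR"/postgresql-*.log; do
    [ -f "$log" ] || continue
    if has_checkpoint_panic "$log"; then
        echo "checkpoint PANIC found in $log"
    fi
done
```

If no file is reported, the connection failure probably has a different cause and the steps below may not apply.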
Impact/Risks:
Resetting the transaction log can cause data loss, because changes recorded only in the transaction log and not yet written to the data files are discarded.
Note:
Take a snapshot of the vCenter Server appliance before proceeding.
1. Stop all services, including Postgres.
service-control --stop --all
2. Switch to the vpostgres user.
su vpostgres -s /bin/sh
3. Reset the transaction log.
For VCSA 6.5.x and 6.7.x:
/opt/vmware/vpostgres/current/bin/pg_resetxlog -f /storage/db/vpostgres
For VCSA 7.0.x:
/opt/vmware/vpostgres/current/bin/pg_resetwal -f /storage/db/vpostgres
4. Exit the vpostgres user shell.
exit
5. Start all services.
service-control --start --all
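
The steps above can be reviewed as a dry run before anything is executed. The sketch below only prints the commands. It picks the reset tool by checking which binary exists in the bin directory rather than by version number. The helper names pick_reset_tool and print_plan are hypothetical, the non-interactive su ... -c form is an assumed equivalent of steps 2 through 4, and the final service-control --status --all line is an added verification step that is not part of the original procedure.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands instead of running them.

# Print the reset tool found in the given bin directory
# (pg_resetwal on 7.0.x, pg_resetxlog on 6.5.x and 6.7.x).
pick_reset_tool() {
    for tool in pg_resetwal pg_resetxlog; do
        if [ -x "$1/$tool" ]; then
            echo "$1/$tool"
            return 0
        fi
    done
    return 1
}

# Print the full command sequence for review; nothing is executed.
print_plan() {
    bin_dir="$1"
    data_dir="$2"
    reset_tool=$(pick_reset_tool "$bin_dir") || {
        echo "no reset tool found in $bin_dir" >&2
        return 1
    }
    echo "service-control --stop --all"
    # Assumption: running the tool via 'su -c' replaces the interactive shell.
    echo "su vpostgres -s /bin/sh -c '$reset_tool -f $data_dir'"
    echo "service-control --start --all"
    # Added check, not in the original steps: list service states afterwards.
    echo "service-control --status --all"
}

print_plan "${BIN_DIR:-/opt/vmware/vpostgres/current/bin}" \
           "${DATA_DIR:-/storage/db/vpostgres}" || true
```

Printing the plan first makes it easy to check that the right tool was selected for the appliance version before committing to the reset, which cannot be undone without the snapshot.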