VMware Postgres Archiver Service changes status to "stopped"
Article ID: 318472
Products
VMware vCenter Server, VMware vCenter Server 7.0, VMware vCenter Server 8.0
Issue/Introduction
VMware Postgres Archiver Service changes status to “stopped” after scheduled backups are configured for the vCenter Server Appliance.
In /var/log/vmware/vpostgres/pg_archiver.log-[n].stderr, you see an error similar to:
yyyy-mm-ddThh:mm:ss.mssZ ERROR pg_archiver could not receive data from WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.
In /var/log/vmware/vpostgres/postgresql-[nn].log, you see an error similar to:
yyyy-mm-ddThh:mm:ss.mssZ UTC 5ae81b07.c7c9 0 [unknown] archiver LOG: terminating walsender process due to replication timeout
The vmware-vpostgres service appears to crash on startup after the vCenter Server Appliance is patched from build 8.0.3.00500 to 8.0.3.00600.
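To check a log file for the two messages quoted above, a small script can search for their fixed text. The following is a minimal sketch, not a VMware tool; the function name find_symptoms and the signature labels are hypothetical and chosen only for illustration.

```python
import re

# Fixed text taken from the two log excerpts quoted in this article.
# The dictionary keys are illustrative labels, not VMware terminology.
SIGNATURES = {
    "archiver_stream_closed": re.compile(
        r"could not receive data from WAL stream"
    ),
    "walsender_timeout": re.compile(
        r"terminating walsender process due to replication timeout"
    ),
}


def find_symptoms(lines):
    """Return {label: [matching lines]} for an iterable of log lines."""
    hits = {name: [] for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits
```

Pass it the lines of the relevant pg_archiver or postgresql log file, for example with open(path) as f: find_symptoms(f). An empty list for both labels suggests the file does not contain this symptom.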
Environment
VMware vCenter Server Appliance 6.7.x, 7.x and 8.x.
Cause
This is caused by the wal_sender_timeout value being too low. If the vCenter Server Appliance experiences high CPU load, the Postgres service may stop the archiver because the archiver cannot report back within the default timeout.
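For reference, PostgreSQL accepts wal_sender_timeout with a time unit such as 60s, and reads a bare number as milliseconds. The sketch below converts such values to milliseconds so that two settings can be compared. It is a hypothetical helper (to_milliseconds is not part of vCenter or PostgreSQL) and handles only a subset of the units PostgreSQL accepts.

```python
import re

# Milliseconds per unit. Only a subset of PostgreSQL's time units is covered.
UNIT_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}


def to_milliseconds(value):
    """Convert a setting such as '60s' or '600000' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+)\s*(ms|s|min|h|d)?\s*", value)
    if match is None:
        raise ValueError(f"unrecognized timeout value: {value!r}")
    number = int(match.group(1))
    unit = match.group(2) or "ms"  # a bare number is taken as milliseconds
    return number * UNIT_MS[unit]
```

For example, to_milliseconds("60s") gives 60000 and to_milliseconds("600s") gives 600000, which is the tenfold increase applied in the Resolution section.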
Resolution
Please note: Take a snapshot or backup of the vCenter Server before making any changes. If the vCenter Server is part of Enhanced Linked Mode or VMware Cloud Foundation, make sure offline snapshots of the environment are available; refer to KB 85662.
Automated Workaround
Download the script attached to this KB, named "Update-Postgres-Conf.py".
Using WinSCP, copy the script to /tmp/scripts on the vCenter Server. (Note: the scripts directory does not exist by default; create it first.)
Log in to the vCenter Server via SSH (for example, with PuTTY).
Change directory to /tmp/scripts:
command - cd /tmp/scripts
Execute the script by running the command below:
command - python3 Update-Postgres-Conf.py
The script will make the necessary changes to /storage/db/vpostgres/postgresql.conf
If the value of "wal_sender_timeout" is already at 600s the script will not make any changes to the file.
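The behavior described above (set the value to 600s, and leave the file alone if it is already 600s) can be sketched as follows. This is not the attached script, whose contents are not shown in this article; it is an illustrative approximation, and the function name set_wal_sender_timeout is hypothetical. It works on the file contents as a string, so it can be tried on a copy before touching /storage/db/vpostgres/postgresql.conf.

```python
import re

# An active (uncommented) setting, with optional single quotes around the value.
ACTIVE = re.compile(r"^\s*wal_sender_timeout\s*=\s*'?([^'#\s]+)'?")
# A commented-out setting such as "#wal_sender_timeout = 60s".
COMMENTED = re.compile(r"^\s*#\s*wal_sender_timeout\s*=")


def set_wal_sender_timeout(conf_text, value="600s"):
    """Return (new_text, changed) with wal_sender_timeout set to value."""
    lines = conf_text.splitlines()
    active = [i for i, line in enumerate(lines) if ACTIVE.match(line)]
    commented = [i for i, line in enumerate(lines) if COMMENTED.match(line)]
    new_line = f"wal_sender_timeout = {value}"

    if active:
        # If a parameter appears more than once, PostgreSQL uses the last one.
        last = active[-1]
        if ACTIVE.match(lines[last]).group(1) == value:
            return conf_text, False  # already at the target: no change
        lines[last] = new_line
    elif commented:
        # Uncomment the first commented entry (any trailing comment is dropped).
        lines[commented[0]] = new_line
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n", True
```

A caller would read the file, call the function, and write the result back only when changed is True, after taking the snapshot or backup noted above.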
Manual Workaround
1. Accessing the PostgreSQL Configuration File via SSH
Establish SSH Connection:
Open PuTTY.
In the "Session" category, enter the hostname or IP address of your vCenter server in the "Host Name (or IP address)" field.
In the "Connection" -> "SSH" -> "Auth" category, browse and select your private key file if you're using SSH key-based authentication. Otherwise, you will be prompted for the root password after connecting.
Click "Open" to establish the SSH connection.
Navigate to the Configuration File:
Once connected, use the cd command to navigate to the PostgreSQL configuration file directory:
cd /storage/db/vpostgres/
Open the Configuration File:
Use a suitable text editor with root privileges.
Using vi:
vi postgresql.conf
Basic vi usage:
Navigate within the file using arrow keys.
Press i to enter insert mode.
Make the necessary changes.
Press Esc to exit insert mode.
Type :wq and press Enter to save and quit.
2. Uncomment and Modify the wal_sender_timeout Parameter
Locate the Parameter:
Search for the line: #wal_sender_timeout = 60s within the postgresql.conf file.
Uncomment the Line:
Remove the leading # character to activate the parameter.
Modify the Value:
Change the value from 60s to 600s:
wal_sender_timeout = 600s
This increases the timeout for the write-ahead log (WAL) sender from 60 seconds to 600 seconds.
3. Save the Changes
Using vi:
Type :wq! and press Enter to save the changes and exit the editor.
4. Restart PostgreSQL Services
Important: Ensure no active tasks are running on the vCenter Server to avoid data inconsistencies.
Stop the PostgreSQL services: service-control --stop vmware-postgres-archiver vmware-vpostgres
Start the PostgreSQL services: service-control --start vmware-postgres-archiver vmware-vpostgres
This adjustment gives the write-ahead log (WAL) sender more time before it terminates the replication connection, which may resolve timeouts caused by high load on the appliance or by slow or intermittent connections during data replication.
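If the restart needs to be scripted, the two service-control commands above can be run in sequence, stopping as soon as one fails. The following is a minimal sketch under that assumption; the function names restart_commands and restart are hypothetical, and the service order is copied unchanged from the commands in step 4.

```python
import subprocess

# Service names and order as given in step 4 of the manual workaround.
SERVICES = ["vmware-postgres-archiver", "vmware-vpostgres"]


def restart_commands(services=SERVICES):
    """Build the stop and start command lines without running them."""
    return [
        ["service-control", "--stop", *services],
        ["service-control", "--start", *services],
    ]


def restart(run=subprocess.run, services=SERVICES):
    """Run stop, then start. check=True raises if a command fails."""
    for command in restart_commands(services):
        run(command, check=True)
```

Calling restart() on the appliance as root runs both commands. The run parameter exists so the sequence can be inspected or tested with a stand-in function before it is executed for real.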
Additional Information
How to stop, start, or restart vCenter Server 6.x services and above