The Site Recovery Manager (SRM) service failed to start, resulting in connection errors with the HBR Manager Server and vCenter. Symptoms included SRM initialization failures and error messages indicating an unresponsive PostgreSQL database. Log analysis confirmed that the PostgreSQL service was not running, preventing SRM from establishing necessary connections.
You may observe the following errors:
Red Banner:
"<SRM Replication Server FQDN>.com is unreachable"
and
"Unable to retrieve pairs from extension server at https://<server name>:8043. Unable to connect to HBR Manager Server .... Reason: java.net.ConnectException: Connection refused..."
Site Recovery Manager 9.x
VMware vCenter Server 8.x
The root cause was a missing or corrupted PostgreSQL log file (postgresql.log), which PostgreSQL needs in order to start. Analysis showed that the file had been deleted or modified, so the PostgreSQL service failed to initialize. Where the file was still present, incorrect ownership or permissions on it further contributed to the failure.
To resolve this issue, follow these steps:
Recreate the PostgreSQL Log File:
Create a new postgresql.log file using a text editor or command-line tools:
touch /var/log/vmware/vrmsdb/postgresql.log
Set Correct Ownership and Permissions:
Assign ownership of the log file to the PostgreSQL user (use the same directory name as in the touch command above):
chown vrmsdb:vrmsdb /var/log/vmware/vrmsdb/postgresql.log
Reboot the host system to ensure PostgreSQL starts correctly and SRM initializes without errors.
Verify Service Status:
Confirm that the PostgreSQL service is running:
systemctl status postgresql
Check SRM service logs for successful initialization and connectivity to vCenter and HBR Manager Server.
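The file-related steps above can be sketched as a small shell function that is safe to run more than once. This is a minimal sketch, not a vendor-supplied script: the function name ensure_log_file is hypothetical, and the owner name in the example call is an assumption that should be checked against the appliance before use.

```shell
#!/bin/sh
# Recreate a missing log file and, when possible, hand it to the database user.
# Arguments: $1 = log file path, $2 = owner in user:group form.
ensure_log_file() {
    log_path=$1
    owner=$2

    # Create the file only if it is absent; touch leaves existing content alone.
    if [ ! -e "$log_path" ]; then
        touch "$log_path" || return 1
        echo "created $log_path"
    else
        echo "exists $log_path"
    fi

    # chown needs root and an existing account; skip it otherwise.
    if [ "$(id -u)" -eq 0 ] && id "${owner%%:*}" >/dev/null 2>&1; then
        chown "$owner" "$log_path" || return 1
    else
        echo "skipped chown (not root, or user ${owner%%:*} not found)"
    fi
}

# Example call with the path from the steps above (owner name is an assumption):
# ensure_log_file /var/log/vmware/vrmsdb/postgresql.log vrmsdb:vrmsdb
```

Because the function only creates the file when it is absent, running it on a healthy system leaves the existing log untouched.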
This issue may occur in environments where manual log management or file system modifications are performed without proper safeguards. Ensure PostgreSQL configuration files and log directories are protected from unintended alterations.
For systems with automated logging, verify that log rotation policies do not inadvertently delete critical files. Regularly back up PostgreSQL configurations and logs to mitigate similar failures.
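One way to check rotation policies is to list the logrotate rule files that mention the database log directory. The sketch below assumes rules live in /etc/logrotate.d, the conventional location on Linux; whether the appliance keeps them there has not been confirmed, and find_rotation_rules is a hypothetical helper name.

```shell
#!/bin/sh
# List rule files under a directory that mention a given path fragment.
# Arguments: $1 = directory holding logrotate rules, $2 = fixed string to look for.
find_rotation_rules() {
    # -r: recurse, -l: print file names only, -F: treat the pattern as a literal.
    grep -rlF -- "$2" "$1" 2>/dev/null || true
}

# Example (directory and fragment are assumptions):
# find_rotation_rules /etc/logrotate.d vrmsdb
```

Any file it lists is worth reading by hand to confirm that the rule rotates the log rather than removing it without recreating it.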
If the issue persists after you complete these steps, review the PostgreSQL startup messages (/var/log/messages or journalctl -u postgresql) for additional errors.
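To narrow a long journal down to likely startup failures, the output can be piped through a simple filter. This is a rough sketch: the pattern list is an assumption about what PostgreSQL startup errors typically contain, not an exhaustive set, and filter_startup_errors is a hypothetical name.

```shell
#!/bin/sh
# Keep only lines that look like startup failures (case-insensitive match).
# Reads from standard input; prints nothing if no line matches.
filter_startup_errors() {
    grep -iE 'fatal|error|permission denied|no such file' || true
}

# Example: scan the current boot's journal for the unit named in this article.
# journalctl -u postgresql -b --no-pager | filter_startup_errors
```

The same filter works on a plain file, for example by redirecting /var/log/messages into it.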