Symptoms:
vmware-vpostgres fails to start on vCenter Server.
Failed to connect to database: ODBC error: (08001) - [unixODBC]could not connect to server: Connection refused
--> Is the server running on host "localhost" (127.0.0.1) and accepting
--> TCP/IP connections on port 5432?
/var/log/vmware/vpostgres/postgresql-xx.log
In /var/log/vmware/vpxd/vpxd.log, you may see entries similar to:
yyyy-mm-ddThh:mm:ss error vpxd[35339] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Failed to connect to database: ODBC error: (08001) - [unixODBC]could not connect to server: Connection refused
--> Is the server running on host "localhost" (127.0.0.1) and accepting
--> TCP/IP connections on port 5432?
--> Retry attempt: 16305 ...
/var/log/vmware/vmon/vmon-syslog.log does not indicate why vmware-vpostgres is not starting:
yyyy-mm-ddThh:mm:ss notice vmon Received start request for vmware-vpostgres
yyyy-mm-ddThh:mm:ss notice vmon <vmware-vpostgres-prestart> Constructed command: /opt/vmware/vpostgres/current/scripts/pg_pre_start
yyyy-mm-ddThh:mm:ss notice vmon Executing service batch op API_HEALTH. IgnoreFail=1, service count=10
yyyy-mm-ddThh:mm:ss notice vmon <vapi-endpoint-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vmon/vmonApiHealthCmd.py -n vapi-endpoint -u /vapiendpoint/health -t 30
yyyy-mm-ddThh:mm:ss notice vmon <rhttpproxy-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-rhttpproxy/rhttpproxy-vmon-apihealth.py
yyyy-mm-ddThh:mm:ss notice vmon <vmware-vpostgres> Skip service health check. State STOPPED, Curr request 1
yyyy-mm-ddThh:mm:ss notice vmon <vcha> Skip service health check. State STOPPED, Curr request 0
yyyy-mm-ddThh:mm:ss notice vmon <vmware-postgres-archiver> Skip service health check. State STOPPED, Curr request 0
yyyy-mm-ddThh:mm:ss notice vmon <vpxd-svcs> Skip service health check. State STOPPED, Curr request 0
yyyy-mm-ddThh:mm:ss notice vmon <vpxd> Skip service health check. State STOPPING, Curr request 1
yyyy-mm-ddThh:mm:ss notice vmon <sps> Skip service health check. State STOPPED, Curr request 0
yyyy-mm-ddThh:mm:ss notice vmon <rbd> Skip service health check. State STOPPED, Curr request 0
yyyy-mm-ddThh:mm:ss notice vmon <pschealth> Skip service health check. State STOPPED, Curr request 0
yyyy-mm-ddThh:mm:ss notice vmon Successfully executed service batch operation API_HEALTH.
In /var/log/vmware/vmon/vmon.log, you see entries similar to the following (grepping for vpostgres is recommended):
yyyy-mm-ddThh:mm:ss Wa(03) host-xxxx <vmware-vpostgres> Service pre-start command's stderr: Generating /storage/db/vpostgres_ssl/root_ca.pem using store TRUSTED_ROOTS
yyyy-mm-ddThh:mm:ss Wa(03) host-xxxx <vmware-vpostgres> Service pre-start command's stderr: Grabbing alias list for store TRUSTED_ROOTS, attempt 1
yyyy-mm-ddThh:mm:ss Wa(03) host-xxxx <vmware-vpostgres-prestart> SysProcess exec timed out. Force kill. Pid ####
yyyy-mm-ddThh:mm:ss Er(02) host-xxxx <vmware-vpostgres> Service pre-start command failed with exit code 1.
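The recommended grep can be narrowed to the pre-start timeout and failure lines shown in the excerpt. The following is a minimal sketch, not a supported tool: the function name vpostgres_prestart_errors is hypothetical, and the match patterns are assumptions taken from the sample entries above.

```shell
# Hypothetical helper: print vmon.log lines for vmware-vpostgres that report
# a pre-start timeout or a failed pre-start command.
# The log path defaults to the one named in this article.
vpostgres_prestart_errors() {
    grep 'vmware-vpostgres' "${1:-/var/log/vmware/vmon/vmon.log}" |
        grep -E 'timed out|failed with exit code'
}
```

Calling vpostgres_prestart_errors with no argument reads the default log path; pass another path to inspect a copy taken from a support bundle.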
In /var/log/vmware/vpxd-svcs/vpxd-svcs.log, you may see the following error:
SQL Error: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.)
VMware vCenter Server 6.x
VMware vCenter Server 7.x
VMware vCenter Server 8.x
This issue is caused by corrupted certificates under /etc/ssl/certs, which result in an unexpectedly high number of certificate entries in the TRUSTED_ROOT_CRLS store.
To confirm the cause of the issue, run the following command on the vCenter Server Appliance (VCSA). If you are using an external PSC, run the command on both the vCenter Server and the PSC:
# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS | grep Number
Output should look like:
Number of entries in store : xxxx
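To compare the count across nodes or over time, the number can be pulled out of that output line. This is a minimal sketch assuming the output format shown above; the function name crl_entry_count is hypothetical and is not part of vecs-cli.

```shell
# Hypothetical helper: read vecs-cli output on standard input and print only
# the numeric value from the "Number of entries in store : N" line.
crl_entry_count() {
    awk -F':' '/Number of entries in store/ { gsub(/[^0-9]/, "", $2); print $2 }'
}

# Example use on the appliance (not executed here):
#   /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS | crl_entry_count
```

The helper prints nothing if the expected line is absent, so an empty result means the command output did not match the assumed format.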
Notes:
To resolve this issue, remove the extra entries from the TRUSTED_ROOT_CRLS store by following the steps below:
# chsh -s /bin/bash root
Changing the root shell to Bash, as described in the linked article, resolves file transfer errors such as:
Host is not communicating for more than 15 seconds. If the problem repeats, try turning off 'Optimize connection buffer size'.
or
Cannot initialize SFTP protocol. Is the host running an SFTP server?
# cd /tmp
# chmod +x crl-fix.sh
# ./crl-fix.sh
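If the script fails with the following error, it was saved with Windows (CRLF) line endings:
bash: ./crl-fix.sh: /bin/bash^M: bad interpreter: No such file or directory
Remove the carriage returns, then run the script again:
# sed -i -e 's/\r$//' crl-fix.sh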
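The line-ending repair can be made conditional, so that the file is rewritten only when carriage returns are actually present. This is a minimal sketch assuming GNU sed, as shipped on the appliance; the function names has_crlf and normalize_line_endings are hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if the file contains a carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Hypothetical helper: strip the trailing carriage return from each line,
# in place, but only when the file contains one. Assumes GNU sed.
normalize_line_endings() {
    if has_crlf "$1"; then
        sed -i -e 's/\r$//' "$1"
    fi
}
```

For example, running normalize_line_endings crl-fix.sh before ./crl-fix.sh leaves a script with correct line endings unchanged.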
# service-control --stop --all
# service-control --start --all