GenAI installation fails with an error connecting to the Postgres DB, and the ai-server job fails on the controller VM with an error similar to the following:
cat ai-server.stdout.log | grep ERROR | grep 'request timed out' | head -1
10:49:50.660 [tomcat-handler-34] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - HikariPool-1 - Connection is not available, request timed out after 30000ms (total=0, active=0, idle=0, waiting=0)
cat ai-server.stdout.log | grep ERROR | grep 'request timed out' | tail -1
13:16:46.560 [tomcat-handler-278] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - HikariPool-1 - Connection is not available, request timed out after 30000ms (total=0, active=0, idle=0, waiting=0)
This can happen if there is a connectivity issue to the Postgres DB from the GenAI deployment network, or if DNS resolution of the Postgres hostname fails from the controller VM. The steps below can be used to troubleshoot and scope the issue.
1) Using the cf CLI, target the GenAI space and run the following command:
cf services
Note the service instance (SI) name, then fetch the service key with the following command:
cf service-key <SI-name> <SI-Key>
Replace <SI-name> and <SI-Key> with the values from your environment.
2) The output contains the DB credentials, similar to the following:
{
  "credentials": {
    "db": "postgres",
    "hosts": [
      "q-s0.postgres-instance.infra.service-instance-04d4ba53-7740-40a7-b402-1a4bb41da825.bosh"
    ],
    "jdbcUrl": "jdbc:postgresql://q-s0.postgres-instance.infra.service-instance-04d4ba53-7740-40a7-b402-1a4bb41da825.bosh:5432/postgres?user=pgadmin&password=5zRHRAX0QJDsLJGawOhu80vR3fTnAs",
    "password": "5zRHRAX0QJDsLJGawOhu80vR3fTnAs",
    "port": 5432,
    "primary_host": "04d4ba53-7740-40a7-b402-1a4bb41da825.postgres.service.internal",
    "uri": "postgresql://pgadmin:5zRHRAX0QJDsLJGawOhu80vR3fTnAs@q-s0.postgres-instance.infra.service-instance-04d4ba53-7740-40a7-b402-1a4bb41da825.bosh:5432/postgres",
    "user": "pgadmin"
  }
}
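If you save the service key output to a file, the host and port needed for the connectivity check can be pulled out with standard tools. The following is a minimal sketch, not an official procedure: the function name pg_endpoint and the file name are hypothetical, and it assumes the JSON layout shown above (a "hosts" array and a numeric "port").

```shell
# Sketch: print "<host> <port>" from saved `cf service-key` output.
# Assumes the layout shown above: a "hosts" array and a numeric "port".
# Lines outside the JSON body are ignored because only the two keys are matched.
pg_endpoint() {
  file="$1"
  # Join all lines so the match works whether or not the JSON is indented.
  flat=$(tr -d '\n' < "$file")
  # First quoted string that follows the "hosts" key.
  host=$(printf '%s' "$flat" | sed -n 's/.*"hosts"[[:space:]]*:[[:space:]]*\[[[:space:]]*"\([^"]*\)".*/\1/p')
  # Numeric value of the "port" key.
  port=$(printf '%s' "$flat" | sed -n 's/.*"port"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  # Fail if either value could not be found.
  [ -n "$host" ] && [ -n "$port" ] || return 1
  printf '%s %s\n' "$host" "$port"
}

# Example (hypothetical file name):
#   cf service-key <SI-name> <SI-Key> > service-key.json
#   pg_endpoint service-key.json
```

For the sample output above, this prints the q-s0 host from the "hosts" array followed by 5432.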
3) SSH to the controller VM in the GenAI deployment using the bosh CLI, then run the following command to check network connectivity:
nc -vz q-s0.postgres-instance.infra.service-instance-04d4ba53-7740-40a7-b402-1a4bb41da825.bosh 5432
In this example the host is q-s0.postgres-instance.infra.service-instance-04d4ba53-7740-40a7-b402-1a4bb41da825.bosh; replace it with the host from your own service key.
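Because nc reports a failure for both an unresolvable hostname and an unreachable port, it can help to test name resolution separately. The sketch below is one way to do that, assuming getent and nc are available on the controller VM; the function name check_pg is hypothetical.

```shell
# Sketch: tell a DNS failure apart from a TCP connectivity failure.
# Assumes getent and nc exist on the VM. Port defaults to 5432.
check_pg() {
  host="$1"
  port="${2:-5432}"
  # Step 1: can the hostname be resolved at all?
  if ! getent hosts "$host" >/dev/null 2>&1; then
    echo "DNS: cannot resolve $host"
    return 1
  fi
  # Step 2: the name resolves, so try the TCP port with a 5 second timeout.
  if ! nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
    echo "NETWORK: $host resolves but port $port is unreachable"
    return 2
  fi
  echo "OK: $host:$port is reachable"
}

# Example: check_pg <postgres-host-from-service-key> 5432
```

A DNS result points toward the DNS checks, while a NETWORK result points toward firewall or routing rules between the two networks.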
If it is a connectivity issue, allow traffic from the GenAI deployment network to the Postgres host on port 5432 in your firewalls or routers. If Postgres is also deployed as a tile and DNS resolution of the hostname fails, check BOSH DNS.
If it is a DNS resolution issue, check that the BOSH DNS certificates have not expired. If a certificate rotation has been performed, confirm that the new certificates have been pushed to all VMs, then rerun Apply Changes on the GenAI tile.
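If you have access to a certificate file in PEM format, its expiry can be checked with openssl. This is a generic sketch, not a BOSH-specific procedure: the function name cert_status is hypothetical, and the certificate path is a placeholder that you must supply, since this sketch does not assume where BOSH DNS stores its certificates.

```shell
# Sketch: report whether a PEM certificate file is still valid right now.
# The path argument is a placeholder supplied by the caller.
cert_status() {
  cert="$1"
  # -checkend 0 exits non-zero if the certificate has already expired;
  # a missing or unparsable file also lands in the else branch.
  if openssl x509 -checkend 0 -noout -in "$cert" >/dev/null 2>&1; then
    echo "valid: $cert"
  else
    echo "expired or unreadable: $cert"
    return 1
  fi
}

# Example: cert_status <path-to-certificate.pem>
# To print the expiry date itself:
#   openssl x509 -enddate -noout -in <path-to-certificate.pem>
```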