Symptoms:
WARN vco [host='vco-app-<id>' thread='http-nio-8280-exec-209' user='<username>' org='-' trace='-'] {} org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 53300
ERROR vco [host='vco-app-<id>' thread='http-nio-8280-exec-209' user='<username>' org='-' trace='-'] {} org.hibernate.engine.jdbc.spi.SqlExceptionHelper - FATAL: too many connections for role "vco-db"
WARN vco [host='vco-app-<id>' thread='http-nio-8280-exec-210' user='<username>' org='-' trace='-'] {} org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 53300
Caused by: org.postgresql.util.PSQLException: FATAL: too many connections for role "vco-db"
INFO vco [host='vco-app-<id>' thread='http-nio-8280-exec-50' user='<username>' org='-' trace='-'] - <IP> - - [dd/mm/yyyy:hh:mm:ss +0000] GET /vco/api/server-configuration/api/category/<ID> HTTP/1.1 200 1343 21784 ms
2023-11-13T12:03:28.924Z INFO vco [host='vco-app-<id>' thread='http-nio-8280-exec-191' user='<username>' org='-' trace='-'] - <IP> - - [dd/mm/yyyy:hh:mm:ss +0000] GET /vco/api/server-configuration/api/category/<ID> HTTP/1.1 200 3164 21788 ms
2023-11-13T12:03:28.925Z INFO vco [host='vco-app-<id>' thread='http-nio-8280-exec-263' user='<username>' org='-' trace='-'] - <IP> - - [dd/mm/yyyy:hh:mm:ss +0000] GET /vco/api/server-configuration/api/category/<ID> HTTP/1.1 200 1190 21353 ms
2023-11-13T12:03:28.926Z INFO vco [host='vco-app-<id>' thread='http-nio-8280-exec-264' user='<username>' org='-' trace='-'] - <IP> - - [dd/mm/yyyy:hh:mm:ss +0000] GET /vco/api/server-configuration/api/category/<ID> HTTP/1.1 200 626 21354 ms
The code extension generates a large number of simultaneous api requests to the /vco/api/server-configuration/api/category url which can overload the Orchestator instances.
A resolution for this issue is planned for the VMware Aria Automation Orchestrator 8.16.1 release where improvements are planned to handle larger volumes of simultaneous API requests.
Workaround: Adjust startup/liveness probes to prevent the Orchestrator service from crashing during these intermittent API spikes:
1. Backup the original deployment.yaml file:
cp /opt/charts/vco/templates/deployment.yaml /home/root/deployment.yaml.old
2. Edit the deployment.yaml file:
vi /opt/charts/vco/templates/deployment.yaml
3. Increase the value for failureThreshold in each of the 3 probe sections: startupProbe, livenessProbe and readinessProbe: Set failureThreshold for livenessProbe to 10
Set failureThreshold for readinessProbe to 30
Set failureThreshold for startupProbe to 150
4. Increase timeoutSeconds
Set all timeoutSeconds values to 10
5. Save changes and edit the vi text editor:
:wq!
6. Redeploy the services for changes to take affect:
/opt/script/deploy.sh