This issue presents as vco-server-app pods that restart continually. It can be further identified by one or more of these symptoms:
The /services-logs/prelude/vco-app/console-logs/vco-server-app.log file contains out-of-memory errors similar to:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /usr/lib/vco/app-server/../app-server/logs/vco_Datestamp_Timestamp_heap_dump.hprof ...
Heap dump file created [4568732636 bytes in 5.664 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space.
In the output of the kubectl -n prelude get pods command, the vco pods show a high number of restarts.
The vmo_clusterauditlog table contains a large count of audit logs for individual workflow execution IDs.

The vco-app pod in a VMware Aria Automation environment that uses the embedded Aria Orchestrator (vRO) consistently restarts or fails its startup probe. The pod keeps restarting, which prevents new workflows from running reliably.
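The first two symptoms can be checked from saved command output. The following is a minimal Python sketch, not a supported tool: the function names are hypothetical, and it assumes you have already captured the text of vco-server-app.log and of kubectl -n prelude get pods (default column layout with a RESTARTS header). It only reports counts, because this article does not define what number of restarts counts as high.

```python
def count_oom_errors(log_text):
    """Count log lines that report a Java heap OutOfMemoryError."""
    marker = "java.lang.OutOfMemoryError: Java heap space"
    return sum(1 for line in log_text.splitlines() if marker in line)


def vco_restart_counts(get_pods_output):
    """Map each vco pod name to its RESTARTS value.

    Assumes the default `kubectl get pods` text layout, where the first
    line is a header that contains a RESTARTS column.
    """
    lines = [line for line in get_pods_output.splitlines() if line.strip()]
    if not lines:
        return {}
    restarts_col = lines[0].split().index("RESTARTS")
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        name = fields[0]
        if name.startswith("vco"):
            # A value such as "41 (3m ago)" still has the integer first.
            counts[name] = int(fields[restarts_col])
    return counts
```

For example, passing the log excerpt shown above to count_oom_errors counts both the initial error line and the "Terminating due to" line, since each contains the marker text.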
To review the events of the pod, run:
kubectl describe pod vco-app-###### -n prelude
You may see the following error in the events:
Startup probe failed: Get "http://###.###.###.###:8280/vco/api/healthstatus?startupProbe=true": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
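If the describe output is long, the relevant events can be filtered out of the captured text. This is a small Python sketch under the assumption that the output of kubectl describe pod has been saved as a string; the function names are hypothetical and the match is a plain substring test on the message shown above.

```python
def startup_probe_failures(describe_output):
    """Return the event lines that mention a failed startup probe."""
    return [
        line.strip()
        for line in describe_output.splitlines()
        if "Startup probe failed" in line
    ]


def startup_probe_timeouts(describe_output):
    """Return only the startup probe failures caused by a timeout."""
    return [
        line
        for line in startup_probe_failures(describe_output)
        if "context deadline exceeded" in line
    ]
```

A non-empty result from startup_probe_timeouts indicates that the health status endpoint did not answer within the probe timeout, which matches the error shown above.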
To determine whether you are hitting this particular issue, query the vmo_clusterauditlog table in the vCO database to see whether any workflow executions have generated a large number of audit logs.
Caution: These steps execute SQL commands directly against the internal vRO database. Always ensure a successful backup or snapshot of the Aria Automation appliance is available before proceeding.
Back up your environment:
Validate and resolve:
If you see out-of-memory errors but query 3 above does not return a high count, consider increasing the default Aria Orchestrator Java heap memory.