vco-pods crash with out of memory errors when workflows generate a high number of audit log events in VMware Aria Automation Orchestrator
Article ID: 322687
Products
VMware Aria Suite
Issue/Introduction
Symptoms:
The /services-logs/prelude/vco-app/console-logs/vco-server-app.log file contains out of memory errors similar to:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /usr/lib/vco/app-server/../app-server/logs/vco_Datestamp_Timestamp_heap_dump.hprof ...
Heap dump file created [4568732636 bytes in 5.664 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space.
When listing the pods with the kubectl -n prelude get pods command, the vco pods show a high number of restarts.
When querying the vco database, the vmo_clusterauditlog table contains a large number of audit log entries for individual workflow execution IDs.
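The first two symptoms can be checked from the appliance shell with a couple of small helpers. This is a minimal sketch, not part of the product: the function names count_oom and high_restart_pods, the "vco" pod name prefix, and the default restart threshold of 5 are illustrative assumptions.

```shell
# Count log lines that mention an out-of-memory error.
# $1 is the path to the log file, for example
# /services-logs/prelude/vco-app/console-logs/vco-server-app.log
count_oom() {
  # -F treats the pattern as a fixed string, -c prints the number of matching lines
  grep -F -c 'java.lang.OutOfMemoryError' "$1"
}

# Print the name and restart count of vco pods at or above a threshold.
# Reads the output of `kubectl -n prelude get pods` on stdin.
# Assumes the default column layout (NAME READY STATUS RESTARTS AGE)
# and that the relevant pod names start with "vco".
high_restart_pods() {
  threshold="${1:-5}"
  awk -v t="$threshold" 'NR > 1 && $1 ~ /^vco/ && ($4 + 0) >= t { print $1, $4 }'
}

# Example usage on the appliance:
#   count_oom /services-logs/prelude/vco-app/console-logs/vco-server-app.log
#   kubectl -n prelude get pods | high_restart_pods 5
```

A non-zero result from count_oom together with one or more pods listed by high_restart_pods is consistent with the symptoms described above, but does not by itself confirm the cause.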
Environment
VMware vRealize Orchestrator 8.x
Cause
The issue can occur when workflows generate an abnormally large number of audit log entries.
Resolution
To determine whether you are hitting this issue, query the vmo_clusterauditlog table in the vCO database for workflow executions that have generated a large number of audit logs.
Note: Before proceeding, take a snapshot of the vRO appliance(s).
1. SSH to the vRealize Orchestrator appliance and log in as the root user.
2. Connect to the vCO database, and type yes when prompted:
   vracli dev psql vco-db
3. Count the audit logs per workflow run:
   SELECT COUNT(eventdata) AS occurrences, eventdata FROM vmo_clusterauditlog GROUP BY eventdata ORDER BY COUNT(eventdata) DESC;
4. If the count returned for any workflow execution ID is higher than 10,000, consider removing those entries. Replace execution-id-1 and execution-id-2 with the IDs identified in step 3:
   DELETE FROM vmo_clusterauditlog WHERE eventdata IN ('execution-id-1', 'execution-id-2', ...);
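When several execution IDs are involved, hand-editing the statements is error-prone. The following sketch generates the SQL text to paste into the psql session. The helper names print_count_query and build_delete are hypothetical, and the HAVING clause is an added convenience that limits the count query to IDs above the threshold; it is not part of the original query.

```shell
# Print a count query that returns only execution IDs above a threshold.
# $1 is the threshold (defaults to the 10,000 mentioned in the article).
print_count_query() {
  printf 'SELECT COUNT(eventdata) AS occurrences, eventdata FROM vmo_clusterauditlog GROUP BY eventdata HAVING COUNT(eventdata) > %d ORDER BY COUNT(eventdata) DESC;\n' "${1:-10000}"
}

# Print a DELETE statement for the execution IDs given as arguments.
# Single quotes inside an ID are doubled so the SQL string literal stays valid.
build_delete() {
  # Refuse to build a statement with an empty IN () list
  [ "$#" -gt 0 ] || return 1
  list=""
  for id in "$@"; do
    esc=$(printf '%s' "$id" | sed "s/'/''/g")
    list="${list:+$list, }'$esc'"
  done
  printf 'DELETE FROM vmo_clusterauditlog WHERE eventdata IN (%s);\n' "$list"
}

# Example usage (IDs are placeholders):
#   print_count_query 10000
#   build_delete execution-id-1 execution-id-2
```

The helpers only print text and never touch the database, so each statement can be reviewed before it is run in the psql session opened with vracli dev psql vco-db.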