The Orchestrator service crashes with an out-of-memory error, and messages similar to the following appear in the kernel log:
May 13 13:14:46 <Hostname> kernel: GC Thread#16 invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=941
May 13 13:14:46 <Hostname> kernel: Memory cgroup out of memory: Kill process 2650547 (java) score 1938 or sacrifice child
May 13 13:14:46 <Hostname> kernel: Killed process 2650547 (java) total-vm:19878224kB, anon-rss:8720004kB, file-rss:32384kB, shmem-rss:0kB
Running the following query against the Orchestrator database shows workflow token runs that are multiple MBs in size:
select * from vmo_workflowtokenstatistics ORDER BY tokensize DESC LIMIT 20;
Selecting the largest run:
select * from vmo_workflowtokenstatistics s, vmo_workflowtoken e where s.tokenid=e.id order by s.tokensize desc limit 1;
returns a workflow token run with a blank tokensize value, indicating a failure during compression/decompression of the stored content.
This applies to Aria Automation Orchestrator 8.x.
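To gauge how many runs are affected before cleaning anything up, a count such as the following can be used. This is a sketch that reuses the vmo_workflowtokenstatistics table and the 10,000,000-byte threshold from the cleanup queries below; adjust the threshold for your environment:
select count(*) from vmo_workflowtokenstatistics where tokensize is null or tokensize >= 10000000;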
The crash can occur when large data is stored in workflow runs, which triggers a defect in the library used to compress and decompress the content, resulting in a failure to calculate the statistics data for the workflow run. This happens in particular when workflows work with files, load large strings into memory, and keep those strings as variables.
The problematic library is intended to be replaced in the Aria Automation 8.18.1 release.
To prevent the issue from occurring, avoid storing unnecessary data in your workflow runs. When you are finished with a variable that may contain a large string, set it to an empty string so the data is not persisted to the Orchestrator database. Where possible, do not keep large data in variables at all unless absolutely needed.
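To identify which workflow runs hold the most data, and therefore which workflows to review for large variables, the join from the symptoms section can be extended to list the top runs together with their sizes (a sketch using only the tables and columns shown above):
select e.*, s.tokensize from vmo_workflowtokenstatistics s, vmo_workflowtoken e where s.tokenid=e.id order by s.tokensize desc limit 20;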
To clean up the largest tokens, use the queries below (only proceed after taking a snapshot):
1. SSH to the appliance and connect to the database instance:
vracli dev psql
2. Connect to the vCO db:
\c vco-db
3. Remove the largest workflow token runs (you can adjust the size threshold in the queries if needed):
delete from vmo_workflowtoken where id in (select tokenid from vmo_workflowtokenstatistics s where (s.tokensize is null or s.tokensize >= 10000000));
delete from vmo_workflowtokencontent where workflowtokenid in (select tokenid from vmo_workflowtokenstatistics s where (s.tokensize is null or s.tokensize >= 10000000));
delete from vmo_workflowtokenstatistics where tokenid in (select tokenid from vmo_workflowtokenstatistics s where (s.tokensize is null or s.tokensize >= 10000000));
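Because the three delete statements remove related rows, you can optionally run them inside a single transaction so the reported row counts can be checked before the changes become permanent. This is a sketch using standard PostgreSQL transaction commands, not a required part of the procedure:
begin;
-- run the three delete statements from step 3 here;
-- psql prints the number of rows each statement removes
commit; -- or: rollback; if the counts look wrong
Afterwards, re-run the first select from the symptoms section to confirm that no oversized or blank tokensize rows remain.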
Another option to mitigate the issue is to increase the memory assigned to the affected Orchestrator instance.