Automation Orchestrator pods continuously restart with no obvious heap dumps or errors

Article ID: 376909

Updated On:

Products

VMware Aria Suite

Issue/Introduction

  • Orchestrator pods are restarting frequently
  • No heap dumps are created in /services-logs/prelude/vco-app/file-logs/
  • The server logs contain no OutOfMemoryError exceptions
  • Journal logs show oom-killer records
  • Garbage collector logs show allocation failures:
    /services-logs/prelude/vco-app/file-logs/vco-server-app_gc.log
    [2025-07-16T08:36:17.407+0000][2.627s][info][gc,heap     ] GC(0) ParOldGen: 0K(1502208K)->104K(1502208K)
    [2025-07-16T08:36:17.407+0000][2.627s][info][gc,metaspace] GC(0) Metaspace: 14254K(14528K)->14254K(14528K) NonClass: 12837K(12992K)->12837K(12992K) Class: 1417K(1536K)->1417K(1536K)
    [2025-07-16T08:36:17.407+0000][2.627s][info][gc          ] GC(0) Pause Young (Allocation Failure) 550M->11M(2108M) 16.906ms
    [2025-07-16T08:36:17.407+0000][2.627s][info][gc,cpu      ] GC(0) User=0.05s Sys=0.00s Real=0.02s
    [2025-07-16T08:36:17.407+0000][2.627s][info][safepoint   ] Safepoint "ParallelGCFailedAllocation", Time since last: 746961732 ns, Reaching safepoint: 3963 ns, Cleanup: 120317 ns, At safepoint: 16980401 ns, Total: 17104681 ns
    [2025-07-16T08:36:18.602+0000][3.822s][info][gc,start    ] GC(1) Pause Young (Allocation Failure)
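
To gauge how often these pauses occur, the completed "Pause Young (Allocation Failure)" lines can be extracted from the GC log. The following is a minimal sketch, not part of the product tooling: the function name allocation_failure_pauses is hypothetical, and the pattern assumes sizes are reported in M, as in the sample above.

```python
import re

# Matches completed pause lines such as:
#   ... GC(0) Pause Young (Allocation Failure) 550M->11M(2108M) 16.906ms
# Assumption: sizes are logged in megabytes (M), as in the sample above.
PAUSE_RE = re.compile(
    r"GC\((?P<id>\d+)\) Pause Young \(Allocation Failure\) "
    r"(?P<before>\d+)M->(?P<after>\d+)M\((?P<total>\d+)M\) (?P<ms>[\d.]+)ms"
)


def allocation_failure_pauses(lines):
    """Return one dict per completed young-generation pause caused by an allocation failure."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            pauses.append({
                "gc_id": int(match["id"]),
                "before_mb": int(match["before"]),
                "after_mb": int(match["after"]),
                "heap_mb": int(match["total"]),
                "pause_ms": float(match["ms"]),
            })
    return pauses


# Two lines taken from the log excerpt above; the second is a "gc,start"
# record without sizes, so it is intentionally skipped.
sample = [
    "[2025-07-16T08:36:17.407+0000][2.627s][info][gc          ] GC(0) "
    "Pause Young (Allocation Failure) 550M->11M(2108M) 16.906ms",
    "[2025-07-16T08:36:18.602+0000][3.822s][info][gc,start    ] GC(1) "
    "Pause Young (Allocation Failure)",
]
pauses = allocation_failure_pauses(sample)
print(len(pauses), pauses)
```

To run it against a real log, pass an open file handle for vco-server-app_gc.log instead of the sample list.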

Environment

VMware Aria Automation Orchestrator 8.13 and later

Cause

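The pod does not have enough non-heap memory for the JVM garbage collector to work properly. Because the memory is exhausted outside the Java heap, the kernel oom-killer terminates the process before the JVM can raise an OutOfMemoryError or write a heap dump.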

Resolution

This issue is fixed in Orchestrator 8.18.1 Patch 2. Refer to VMware Aria Automation 8.18.1 Cumulative Update #2.

Workaround

Prerequisites

Take a snapshot of your environment.

Procedure

  1. Edit the resource metrics file in your custom profile with the desired memory values.
    vi /etc/vmware-prelude/profiles/custom-profile/helm/prelude_vco/90-resources.yaml
  2. Ensure that serverMemoryRequest is at least 50% larger than serverJvmHeapMax and that serverMemoryLimit is at least 2G larger than serverMemoryRequest.
    • If serverMemoryRequest cannot be increased, decrease serverJvmHeapMax to 60% of serverMemoryLimit or less.
    • For Aria Orchestrator 8.18.1 environments with this issue, set serverJvmHeapMax to 40% of serverMemoryLimit.
  3. Run /opt/scripts/deploy.sh to restart the system.
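
The sizing rules in step 2 can be checked with simple arithmetic before editing the file. The sketch below is illustrative only: the function names check_sizing and fallback_heap_max_gb are hypothetical, all values are in gigabytes, and the example numbers are assumptions rather than recommended settings.

```python
def check_sizing(heap_max_gb, request_gb, limit_gb):
    """Return the step 2 rules that the given values violate (empty list = pass)."""
    problems = []
    # Rule 1: serverMemoryRequest at least 50% larger than serverJvmHeapMax.
    if request_gb < 1.5 * heap_max_gb:
        problems.append("serverMemoryRequest is less than 1.5 x serverJvmHeapMax")
    # Rule 2: serverMemoryLimit at least 2G larger than serverMemoryRequest.
    if limit_gb < request_gb + 2:
        problems.append("serverMemoryLimit is less than serverMemoryRequest + 2G")
    return problems


def fallback_heap_max_gb(limit_gb, fraction=0.6):
    """Heap ceiling when the request cannot be increased.

    fraction=0.6 follows the general guidance in step 2;
    pass fraction=0.4 for 8.18.1 environments with this issue.
    """
    return limit_gb * fraction


# Hypothetical example values, in gigabytes.
ok = check_sizing(heap_max_gb=4, request_gb=6, limit_gb=8)
bad = check_sizing(heap_max_gb=7, request_gb=8, limit_gb=9)
print(ok)
print(bad)
print(fallback_heap_max_gb(10), fallback_heap_max_gb(10, fraction=0.4))
```

An empty list means both rules hold. Any returned message names the rule to fix before running deploy.sh.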

Additional Information

For steps to scale the heap memory size of the Automation Orchestrator server, refer to the product documentation.