Automation Orchestrator pods continuously restart with no obvious heap dumps or errors

Article ID: 376909

Updated On:

Products

VMware Aria Suite

Issue/Introduction

  • Orchestrator pods are restarting frequently
  • There are no heap dumps created in /services-logs/prelude/vco-app/file-logs/
  • There are no OutOfMemoryError entries in the server logs
  • Journal logs show oom-killer records
  • Garbage collector logs show allocation failures
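
One way to confirm the journal symptom is to count OOM-killer records in the kernel journal. The following is a minimal sketch, not an official diagnostic: the helper name count_oom_records is hypothetical, and the match patterns assume the usual kernel wording ("invoked oom-killer", "Out of memory: Killed process"), which may differ between kernel versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count lines on stdin that look like kernel OOM-killer records.
# The patterns are an assumption based on common kernel log wording.
count_oom_records() {
  # grep -c prints the number of matching lines; "|| true" keeps the
  # exit status at zero when there are no matches (grep returns 1 then).
  grep -ciE 'oom-killer|out of memory' || true
}

# Example usage on the appliance (not run here):
#   journalctl -k --no-pager | count_oom_records
#   ls -l /services-logs/prelude/vco-app/file-logs/
```

A non-zero count, combined with an empty heap dump directory, matches the symptoms listed above.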

Environment

  • VMware Aria Automation Orchestrator 8.13 and later

Cause

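  • The pod does not have enough non-heap memory for the garbage collector to work properly. When the container exceeds its memory limit, the kernel OOM killer terminates the process, which is consistent with the absence of heap dumps and OutOfMemory exceptions.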

Resolution

Orchestrator 8.18.1 Patch 2 includes fixes that may help with this issue in some environments.

 

Workaround

Prerequisites

Take a snapshot of your environment.

Procedure

  1. Edit the resource metrics file in your custom profile with the desired memory values.
    vi /etc/vmware-prelude/profiles/custom-profile/helm/prelude_vco/90-resources.yaml
  2. Ensure that serverMemoryRequest is at least 50% larger than serverJvmHeapMax and that serverMemoryLimit is at least 2G larger than serverMemoryRequest.
    • If serverMemoryRequest cannot be increased, decrease serverJvmHeapMax to 60% of serverMemoryLimit or less.
    • For Aria Orchestrator 8.18.1 environments with this issue, it is recommended to set serverJvmHeapMax to 40% of serverMemoryLimit.
  3. Run /opt/scripts/deploy.sh to restart the system.
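
The sizing rules in step 2 can be worked through with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, all values are expressed in MiB, and "2G" is assumed to mean 2048 MiB. It does not read or write 90-resources.yaml.

```shell
#!/usr/bin/env bash
# Illustrative sizing helpers; every value is in MiB (an assumption of this sketch).

# Smallest serverMemoryRequest: 50% larger than serverJvmHeapMax.
min_request_mib() {
  local heap_mib=$1
  echo $(( (heap_mib * 3 + 1) / 2 ))      # heap * 1.5, rounded up
}

# Smallest serverMemoryLimit: 2G (assumed 2048 MiB) larger than serverMemoryRequest.
min_limit_mib() {
  local request_mib=$1
  echo $(( request_mib + 2048 ))
}

# Largest serverJvmHeapMax for a given serverMemoryLimit and percentage
# (60 in general, 40 for affected 8.18.1 environments).
max_heap_mib() {
  local limit_mib=$1 percent=$2
  echo $(( limit_mib * percent / 100 ))   # integer division rounds down
}
```

For example, with a 4096 MiB heap, min_request_mib 4096 prints 6144 and min_limit_mib 6144 prints 8192. If the limit is fixed at 8192 MiB instead, max_heap_mib 8192 60 prints 4915 and max_heap_mib 8192 40 prints 3276.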