Collector service on Aria Operations 8.x keeps crashing with "java.lang.OutOfMemoryError: GC overhead limit exceeded" errors and creates hprof files

Article ID: 378724

Updated On:

Products

VMware Aria Suite

Issue/Introduction

  • The Collector service keeps crashing on Aria Operations 8.x, and Aria Operations raises multiple alerts for adapters that are not collecting.
  • The Collector service runs out of memory and generates heap dump (hprof) files:

    collector-wrapper.log:
    2024/09/09 23:37:15 | INFO   | jvm 1    | java.lang.OutOfMemoryError: GC overhead limit exceeded
    2024/09/09 23:37:15 | INFO   | jvm 1    | Dumping heap to /storage/db/vcops/heapdump/java_pid2108087.hprof ...
    2024/09/09 23:38:29 | DEBUG  | wrapper  | Pending Pings 2
    2024/09/09 23:39:03 | INFO   | jvm 1    | Heap dump file created [17291222638 bytes in 108.166 secs]
    2024/09/09 23:39:03 | INFO   | jvm 1    | #
    2024/09/09 23:39:03 | INFO   | jvm 1    | # java.lang.OutOfMemoryError: GC overhead limit exceeded
    2024/09/09 23:39:03 | INFO   | jvm 1    | # -XX:OnOutOfMemoryError="/usr/lib/vmware-vcops/install/oom-handler.sh %p"
    2024/09/09 23:39:03 | INFO   | jvm 1    | #   Executing /bin/sh -c "/usr/lib/vmware-vcops/install/oom-handler.sh 2108087"...
    2024/09/09 23:39:04 | DEBUG  | wrapper  | Signal trapped.  Details:
    2024/09/09 23:39:04 | DEBUG  | wrapper  |   signal number=17 (SIGCHLD), source="unknown"
    2024/09/09 23:39:04 | DEBUG  | wrapper  | Received SIGCHLD, checking JVM process status.
    2024/09/09 23:39:04 | STATUS | wrapper  | JVM received a signal SIGKILL (9).
    2024/09/09 23:39:04 | STATUS | wrapper  | JVM process is gone.
    2024/09/09 23:39:04 | DEBUG  | wrapper  | JVM process exited with a code of 1, setting the wrapper exit code to 1.
    2024/09/09 23:39:04 | ERROR  | wrapper  | JVM exited unexpectedly.

  • The adapter causing the problem is the vCloud Director adapter.
  • The vCloud Director adapter logs show the following error:

    2024-09-18T12:42:13,557+0000 ERROR [Collector worker thread 2] (8083) com.integrien.adapter3.vcloud.VCloudAdapter.collectResources - null
    java.util.concurrent.TimeoutException: null
        at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:?]
        at com.integrien.adapter3.vcloud.VCloudAdapter.collectResources(VCloudAdapter.java:890) ~[vcloud_adapter3.jar:?]
        at com.integrien.adapter3.vcloud.VCloudAdapter.onCollect(VCloudAdapter.java:1699) ~[vcloud_adapter3.jar:?]
        at com.integrien.alive.common.adapter3.AdapterBase.collectBase(AdapterBase.java:767) ~[vrops-adapters-sdk.jar:?]
        at com.integrien.alive.common.adapter3.AdapterBase.collect(AdapterBase.java:553) ~[vrops-adapters-sdk.jar:?]
        at com.integrien.alive.collector.CollectorWorkItem3.run(CollectorWorkItem3.java:47) ~[vcops-collector-1.0-SNAPSHOT.jar:?]
        at com.integrien.alive.common.util.ThreadPool$WorkerItem.run(ThreadPool.java:275) ~[vrops-adapters-sdk.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
        at java.lang.Thread.run(Unknown Source) ~[?:?] 
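
To check whether a log bundle shows the same symptoms, the two excerpts above can be scanned with a short script. The following is a minimal Python sketch, not a supported tool: the function names are hypothetical, the regular expressions are derived only from the log lines quoted in this article, and other builds may format these lines differently.

```python
import re

# Patterns derived from the collector-wrapper.log excerpt in this article.
OOM_RE = re.compile(r"java\.lang\.OutOfMemoryError: (.+)$")
DUMP_START_RE = re.compile(r"Dumping heap to (\S+\.hprof)")
DUMP_DONE_RE = re.compile(r"Heap dump file created \[(\d+) bytes in ([\d.]+) secs\]")
SIGKILL_RE = re.compile(r"JVM received a signal SIGKILL")

# Patterns derived from the vCloud Director adapter log excerpt in this article.
ADAPTER_ERROR_RE = re.compile(r"ERROR .*VCloudAdapter\.collectResources")
TIMEOUT_RE = re.compile(r"^java\.util\.concurrent\.TimeoutException")


def summarize_wrapper_log(lines):
    """Count OOM lines, heap dumps, and JVM kills in wrapper log lines."""
    summary = {"oom_lines": 0, "heap_dumps": [], "dump_bytes": 0, "jvm_kills": 0}
    for line in lines:
        line = line.rstrip()
        if OOM_RE.search(line):
            summary["oom_lines"] += 1
        match = DUMP_START_RE.search(line)
        if match:
            summary["heap_dumps"].append(match.group(1))
        match = DUMP_DONE_RE.search(line)
        if match:
            summary["dump_bytes"] += int(match.group(1))
        if SIGKILL_RE.search(line):
            summary["jvm_kills"] += 1
    return summary


def count_collect_timeouts(lines):
    """Count collectResources errors immediately followed by a TimeoutException."""
    count = 0
    previous_was_error = False
    for line in lines:
        line = line.strip()
        if previous_was_error and TIMEOUT_RE.match(line):
            count += 1
        previous_was_error = bool(ADAPTER_ERROR_RE.search(line))
    return count
```

Both functions accept any iterable of lines, so an open file handle for a copy of the relevant log can be passed in directly. Each out-of-memory event in the excerpt above produces two matching OutOfMemoryError lines (the error itself and the JVM banner), so oom_lines counts lines rather than distinct crashes; heap_dumps and jvm_kills are closer to a per-crash count.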

Environment

Aria Operations 8.x

Cause

This is a known issue affecting vCloud Director Management Pack 8.14.0.22780045 GA.

Resolution

Upgrade the vCloud Director Management Pack to 8.16. Download it from this link.