HCX Bulk or RAV Migration Fails due to Memory Error



Article ID: 316646


Updated On:

Products

VMware HCX

Issue/Introduction

  • The HCX version in use is 4.0.0.
  • A Bulk or RAV Migration fails with a memory-related error during the initial sync phase. The error in the UI reads either 'GC overhead limit exceeded' or 'Java heap space'.

Example:

The following error appears in /common/logs/admin/app.log on the source HCX Manager (with respect to the direction of the migration):

2021-05-26 07:06:11.961 UTC [ReplicationTransferService_SvcThread-288588, Ent: HybridityAdmin, , TxId:#####################################] ERROR c.v.h.s.r.j.ReplicationTransferMonitor- Job (#####################################) failed with exception.
java.lang.OutOfMemoryError: GC overhead limit exceeded

OR

2021-05-26 06:53:02.530 UTC [ReplicationTransferService_SvcThread-288570, Ent: HybridityAdmin, , TxId: fd9faeb5-b50b-4dae-b319-3587d821834a] ERROR c.v.h.s.r.j.ReplicationTransferMonitor- Job (#####################################) failed with exception.
java.lang.OutOfMemoryError: Java heap space
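
To confirm the symptom without reading the log by hand, the two messages quoted above can be searched for with a short script. This is a minimal sketch, not a VMware-provided tool: the function names find_oom_errors and scan_log_file are hypothetical, and it assumes the log has been copied to, or is readable from, the machine where Python runs.

```python
import re
from typing import Iterable, List, Tuple

# The two OutOfMemoryError messages quoted in this article.
OOM_PATTERN = re.compile(
    r"java\.lang\.OutOfMemoryError: (GC overhead limit exceeded|Java heap space)"
)


def find_oom_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, message) for every matching OutOfMemoryError line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = OOM_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


def scan_log_file(path: str) -> List[Tuple[int, str]]:
    """Scan a log file on disk (hypothetical helper; path is supplied by the caller)."""
    # errors="replace" keeps the scan going if the log contains undecodable bytes.
    with open(path, encoding="utf-8", errors="replace") as log:
        return find_oom_errors(log)
```

For example, scan_log_file("/common/logs/admin/app.log") returns a list of line numbers and messages. An empty list means neither message was found in that file.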

 

Cause

A minor memory leak affecting HCX 4.0.0 can cause migrations to fail during the initial sync phase of a large wave of Bulk or RAV Migrations.

Resolution

Upgrade the environment to HCX 4.0.1 or later. This issue is fixed in version 4.0.1 and all subsequent releases.

Workaround:
If an upgrade is not possible, reboot the HCX Manager and re-initiate the migrations in smaller groups. This prevents recurrence for a period of time that depends on the environment's workload.
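
One way to plan the smaller groups is to split the list of virtual machines into fixed-size batches before scheduling each wave. The sketch below only does the planning and does not call any HCX API. The function name split_into_batches and the batch size are assumptions for illustration; a suitable size depends on the environment's workload and is not specified by this article.

```python
from typing import List, Sequence


def split_into_batches(vm_names: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split VM names into consecutive groups of at most batch_size entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(vm_names[start:start + batch_size])
        for start in range(0, len(vm_names), batch_size)
    ]


# Hypothetical VM names and batch size, for illustration only.
waves = split_into_batches(["vm-01", "vm-02", "vm-03", "vm-04", "vm-05"], 2)
for index, wave in enumerate(waves, start=1):
    print(f"wave {index}: {', '.join(wave)}")
```

Each resulting wave can then be migrated on its own, with the next wave started only after the previous one completes its initial sync.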