Replication Assisted vMotion Migration Fails for high churn VM with Error: "Not able to get group instance post snapshot"

Article ID: 403533

Updated On:

Products

VMware HCX

Issue/Introduction

When migrating a VM using Replication Assisted vMotion (RAV), the migration fails repeatedly with the following error: "Not able to get group instance post snapshot."

Environment

HCX 4.11

Cause

The failure occurs during the switchover phase because the online sync runs too long (more than 5 hours) without completing. In the current implementation, if the sync does not produce a group instance after 5 retry attempts, the migration fails by design.
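The retry behavior described above can be pictured with a minimal sketch. This is not HCX source code: the function name, the exception type, and the try_sync callback are hypothetical, and only the limit of 5 attempts and the error text come from this article.

```python
from typing import Callable, Optional

MAX_SYNC_ATTEMPTS = 5  # retry limit stated in this article


class MigrationFailed(Exception):
    """Raised when no group instance is produced within the retry limit."""


def wait_for_group_instance(
    try_sync: Callable[[], Optional[str]],
    max_attempts: int = MAX_SYNC_ATTEMPTS,
) -> str:
    """Call try_sync up to max_attempts times and return the first result.

    try_sync is a hypothetical callback that returns a group instance ID,
    or None when a sync could not start or did not complete.
    """
    for _ in range(max_attempts):
        instance = try_sync()
        if instance is not None:
            return instance
    # All attempts were used without a group instance: fail the migration.
    raise MigrationFailed("Not able to get group instance post snapshot")
```

In this sketch a callback that always returns None exhausts all 5 attempts and raises the error, which mirrors the failure mode reported for high churn VMs.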

app.log entries also show repeated sync conflicts:

WARN  c.v.h.s.r.j.BaseDiskReplicationJob- [migId=#####-#####-####-####-#######] Sync couldn't be started as there was an active sync already. Observed count: 5 
ERROR c.v.h.s.r.j.BaseDiskReplicationJob- [migId=#####-#####-####-####-#######] Not able to get group instance post snapshot
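
To check whether a migration hit this condition, the two messages above can be counted per migration ID in a local copy of app.log. The following is a minimal sketch, assuming the log lines contain the level, logger name, and [migId=...] tag in the layout quoted above; the function name and the shape of the returned summary are illustrative, and the patterns may need adjusting if the real lines carry additional fields between the level and the logger name.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

# Patterns are derived only from the two log lines quoted in this article.
LINE_RE = re.compile(
    r"\b(WARN|ERROR)\s+\S*BaseDiskReplicationJob-\s*\[migId=([^\]]+)\]\s*(.*)"
)
COUNT_RE = re.compile(r"Observed count:\s*(\d+)")
FAILURE_TEXT = "Not able to get group instance post snapshot"


def summarize_replication_log(lines: Iterable[str]) -> Dict[str, dict]:
    """Summarize active-sync warnings and the final error per migration ID."""
    summary = defaultdict(
        lambda: {"active_sync_warnings": 0, "max_observed_count": 0, "failed": False}
    )
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # unrelated log line
        level, mig_id, message = match.groups()
        entry = summary[mig_id]
        if level == "WARN" and "active sync already" in message:
            entry["active_sync_warnings"] += 1
            count = COUNT_RE.search(message)
            if count:
                entry["max_observed_count"] = max(
                    entry["max_observed_count"], int(count.group(1))
                )
        elif level == "ERROR" and FAILURE_TEXT in message:
            entry["failed"] = True
    return dict(summary)
```

A file object can be passed directly, for example `with open("app.log") as f: print(summarize_replication_log(f))`. A migration ID with `failed` set to True and an observed count of 5 matches the pattern described in this article.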

This issue is linked to high data churn on the VM, where data is written faster than it can be replicated within the RPO cycle.
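
Why high churn prevents the sync from completing can be shown with a simplified model. The sketch below is not the HCX replication algorithm: it assumes a constant write rate and a constant replication throughput, ignores RPO scheduling and compression, and uses hypothetical function and parameter names. Each sync cycle transfers the current delta, and the data written during that cycle becomes the next delta, so the delta only shrinks when the churn rate is lower than the throughput.

```python
from typing import Optional


def cycles_until_delta_below(
    initial_delta_mb: float,
    churn_mb_per_s: float,
    throughput_mb_per_s: float,
    target_delta_mb: float,
    max_cycles: int = 1000,
) -> Optional[int]:
    """Return the number of sync cycles needed for the delta to reach the target.

    Returns None when the delta cannot shrink (churn >= throughput) or when
    the target is not reached within max_cycles.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    if churn_mb_per_s >= throughput_mb_per_s:
        return None  # data is written at least as fast as it is replicated

    delta = initial_delta_mb
    cycles = 0
    while delta > target_delta_mb and cycles < max_cycles:
        sync_seconds = delta / throughput_mb_per_s  # time to ship current delta
        delta = churn_mb_per_s * sync_seconds       # data written meanwhile
        cycles += 1
    return cycles if delta <= target_delta_mb else None
```

For example, `cycles_until_delta_below(1000, 50, 100, 100)` returns 4 because the delta halves each cycle (500, 250, 125, 62.5 MB), while `cycles_until_delta_below(1000, 120, 100, 100)` returns None because the delta grows with every cycle.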

Resolution

If zero downtime is not a strict requirement, use Bulk Migration for high-churn VMs. Bulk Migration is more tolerant of heavy write activity and does not rely on continuous replication within RPO limits.

Additional Information

  • High data churn means the VM is constantly changing data (writes, deletes, and modifications), which makes it difficult to keep replication in sync.
  • This behavior is by design in the current product version.