VMware HCX Bulk Migration Hangs at Offline Sync Due to Manual Power Off


Article ID: 438977


Updated On:

Products

VMware HCX

Issue/Introduction

VMware HCX (HCX) Bulk Migration fails to complete the switchover phase and hangs during the offline synchronization process.

Symptoms and logs observed:

  • The switchover task stalls at the START_OFFLINE_SYNC state.

  • The Host-Based Replication service (Hbrsvc) on the source ESXi host reports deltaState=lwd failed and removes the replication group.

app.log (HCX Manager):

INFO  c.v.h.s.r.j.ReplicationSwitchoverJob- Switchover running in state START_OFFLINE_SYNC

hostd.log (Source ESXi):

[Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/########/######/VM_Name.vmx] State Transition (VM_STATE_ON_SHUTTING_DOWN -> VM_STATE_OFF)
[Originator@6876 sub=Hbrsvc] Cleaning up ReplicationGroup (groupID=VRID-######) (state=inactive) (deltaState=lwd failed)
[Originator@6876 sub=Hbrsvc] ReplicationScheduler: removing group (groupID=VRID-#######)

vpxd.log (vCenter Server):

The power-off was performed by an automation tool a few hours before the switchover:

vpxd[######] [vim.event.VmGuestShutdownEvent] [info] [<Automation-Tool-Name>] [Guest OS shut down for VM_Name on Hostname in Cluster_name]

vpxd[####]: [vim.event.VmPoweredOffEvent] [info] [VM_Name on Hostname in Cluster_name is powered off]
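To check a collected hostd.log for this signature, the entries above can be matched with a short script. The following is a minimal sketch, not a VMware-provided tool: the function names `scan_hostd` and `matches_known_issue` are hypothetical, and the regular expressions are derived only from the log excerpts shown in this article, so they may need adjusting for other builds.

```python
import re

# Patterns derived from the hostd.log excerpts in this article (assumption:
# the message wording is the same in the log being examined).
POWER_OFF = re.compile(r"State Transition \((\w+) -> VM_STATE_OFF\)")
LWD_FAILED = re.compile(
    r"Cleaning up ReplicationGroup \(groupID=([^)]+)\).*deltaState=lwd failed"
)
GROUP_REMOVED = re.compile(
    r"ReplicationScheduler: removing group \(groupID=([^)]+)\)"
)


def scan_hostd(lines):
    """Collect the power-off and replication-group events found in the lines."""
    findings = {"powered_off": False, "lwd_failed_groups": [], "removed_groups": []}
    for line in lines:
        if POWER_OFF.search(line):
            findings["powered_off"] = True
        match = LWD_FAILED.search(line)
        if match:
            findings["lwd_failed_groups"].append(match.group(1))
        match = GROUP_REMOVED.search(line)
        if match:
            findings["removed_groups"].append(match.group(1))
    return findings


def matches_known_issue(findings):
    """True when a power-off and an 'lwd failed' group cleanup both appear."""
    return findings["powered_off"] and bool(findings["lwd_failed_groups"])
```

A typical use is to pass the opened log file, for example `matches_known_issue(scan_hostd(open("hostd.log", errors="replace")))`. A positive result only indicates that the signature is present; the vpxd.log events should still be checked to confirm when and by whom the VM was powered off.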

Environment

VMware HCX 4.11.3

Cause

The source virtual machine was powered off outside of HCX (manually or, as in the logs above, by an automation tool) before the HCX-orchestrated switchover phase. HCX Bulk Migration requires the source VM to be powered on so that it can coordinate a graceful guest-level shutdown via VMware Tools. Powering off the VM beforehand bypasses this orchestration: the Host-Based Replication (HBR) service fails the final delta sync (deltaState=lwd failed) and removes the replication group, so the switchover stalls at START_OFFLINE_SYNC.

Resolution

  1. Ensure that automated scripts or administrators do not manually power off Virtual Machines actively undergoing an HCX Bulk Migration.

  2. Allow VMware HCX to handle the power-off operation during the scheduled switchover window.

  3. If application quiescing is required before the cutover window, stop the application services from inside the guest OS, but leave the Virtual Machine powered on in vCenter Server.

  4. If a migration is stalled due to this issue, cancel the current migration task, ensure the Virtual Machine is powered on, and trigger a new switchover.
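The steps above can be built into an automation tool as a guard that runs before any power operation. The sketch below is illustrative only: `VmStatus`, `power_off_allowed`, and `needs_power_on` are hypothetical names, and it assumes the tool already knows each VM's vSphere power state and whether the VM is part of an in-flight Bulk Migration (for example, from a list maintained by the migration team). It does not call any HCX or vSphere API.

```python
from dataclasses import dataclass


@dataclass
class VmStatus:
    name: str
    power_state: str         # vSphere-style value, e.g. "poweredOn" or "poweredOff"
    in_bulk_migration: bool  # assumption: supplied by the caller, not queried from HCX


def power_off_allowed(vm):
    """Steps 1-2: never power off a VM that HCX is currently migrating."""
    return not vm.in_bulk_migration


def needs_power_on(vms):
    """Step 4: names of migrating VMs that must be powered on before retrying."""
    return [
        vm.name
        for vm in vms
        if vm.in_bulk_migration and vm.power_state != "poweredOn"
    ]
```

Keeping the guard as plain functions over already-collected state makes it easy to add to an existing scheduler or script without tying it to a particular SDK.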


Additional Information

HCX Bulk Migration Operations and Best Practices