HCX Bulk migration task: "Offline sync started on source VM" is taking longer than expected for some VMs
Article ID: 390725
Products
VMware Cloud on AWS, VMware HCX
Issue/Introduction
During the migration switchover, the source VM is powered off, and the task 'Offline sync started on source VM' is taking several hours for some VMs.
To view the migration events, open the 'Migration' tab in the HCX Manager UI and expand the virtual machine.
The affected VMs typically run services that generate significant data churn, such as SQL Server.
While the offline sync is ongoing, no progress is displayed in the HCX Manager UI, giving the impression that the migration is stuck.
Cause
Bulk migration replication begins with an initial full synchronization to the remote site.
The switchover can start immediately after the initial synchronization is completed, or it can be delayed until a specific time using the scheduled migration option.
By using the scheduled migration option, the switchover can occur during a maintenance window.
After the initial synchronization is completed, delta synchronizations with a two-hour recovery point objective (RPO) occur while waiting for the scheduled switchover.
During switchover, the source VM is powered off to perform a final offline synchronization, data consolidation, and VM instantiation at the target data center.
For virtual machines with high data churn, such as those running SQL Server, this process can take from minutes to several hours, depending on the size of the delta file.
In addition, the replica instance VMDK files are consolidated (deleted), which is a time-consuming process. Its duration depends on the target vCenter Server infrastructure, cannot be predicted, and is mostly unrelated to the HCX bulk migration workflow.
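As a rough illustration of why the offline sync duration varies so widely, the sketch below estimates the transfer time of the final delta from the amount of data changed since the last delta sync. This is a simplified model, not something HCX reports: the function name, the churn rate, and the throughput figure are all hypothetical, and consolidation time on the target is not included.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_offline_sync_seconds(changed_bytes_per_hour,
                                  hours_since_last_sync,
                                  throughput_bytes_per_sec):
    """Estimate the time to transfer the final delta (simplified model).

    All three inputs are assumptions supplied by the operator; HCX does
    not expose them directly.
    """
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput_bytes_per_sec must be positive")
    # Data that changed since the last completed delta sync.
    delta_bytes = changed_bytes_per_hour * hours_since_last_sync
    return delta_bytes / throughput_bytes_per_sec


# Hypothetical figures: 20 GiB of changed data per hour, 2 hours since the
# last delta sync (the RPO interval), 25 MiB/s of effective throughput.
seconds = estimate_offline_sync_seconds(20 * GIB, 2, 25 * MIB)
print(f"Estimated offline sync transfer: {seconds / 60:.1f} minutes")
```

With these example figures the transfer alone takes roughly 27 minutes. A VM with a higher churn rate or a slower link scales the estimate proportionally, which is how the same task can run for hours.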
Resolution
Although the task may appear stuck at 'Offline sync started on source VM', the process is typically still running in the background; the HCX Manager UI simply does not display its progress. An enhancement to display offline sync progress in the HCX Manager UI is planned for a future release.
Manual process to review the delta sync progress:
Note: If you don't have SSH access to the ESXi host, you can use the vCenter MOB to get similar information. This is required for Hyperscaler customers when SSH access is not allowed.
From vCenter, identify the specific ESXi host where the source VM is running.
SSH to the ESXi host where the source VM is running, and run the command: vim-cmd vmsvc/getallvms | grep "<vm-name>"
From the output of the previous step, note the VMID and run the following command: vim-cmd hbrsvc/vmreplica.getState <VMID>
The output will be similar to:
If the output indicates "lwd delta", then delta sync is currently in progress.
To monitor progress, run the following command to repeat the check every 60 seconds: watch -n 60 vim-cmd hbrsvc/vmreplica.getState <VMID>
By comparing the outputs, you can estimate how long it will take for the delta sync to complete.
Once the delta sync is completed, a consolidation process will occur on the target datastore. This step can also be time-consuming, depending on the capabilities of the target datastore (e.g., whether it uses flash storage or not).
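The comparison of two outputs described in the steps above can be sketched as a simple linear extrapolation. This is a minimal example that assumes the getState output reports how much data has been transferred and how much there is in total; the function name and the sample figures are hypothetical, and the values are read from the output manually.

```python
GIB = 1024 ** 3


def estimate_remaining_seconds(transferred_before, transferred_after,
                               total_bytes, interval_seconds):
    """Extrapolate the remaining delta sync time from two samples.

    transferred_before / transferred_after: bytes transferred as read
    from two consecutive outputs taken interval_seconds apart.
    Returns None when no progress was made between the samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rate = (transferred_after - transferred_before) / interval_seconds
    if rate <= 0:
        return None  # no measurable progress, cannot extrapolate
    remaining_bytes = max(total_bytes - transferred_after, 0)
    return remaining_bytes / rate


# Hypothetical samples taken 60 seconds apart: 10 GiB, then 11 GiB
# transferred, out of a 50 GiB delta.
eta = estimate_remaining_seconds(10 * GIB, 11 * GIB, 50 * GIB, 60)
if eta is None:
    print("No progress between samples")
else:
    print(f"Estimated time remaining: {eta / 60:.0f} minutes")
```

The estimate assumes a constant transfer rate, so sampling over a longer interval, or repeating the calculation as the sync proceeds, gives a steadier figure. It does not cover the consolidation step that follows on the target datastore.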
Manual process to review the delta sync progress using vCenter MOB:
From vCenter UI, click on the VM that is powered off during the offline sync phase.
From the browser URL, you can obtain the MOID for the VM, for example: https://<vCenter-FQDN>/ui/app/vm;nav=h/urn:vmomi:VirtualMachine:vm-123456:########-####-####-####-###########/summary The MOID in this example is: vm-123456
Next, go to https://<vCenter-FQDN>/mob/?moid=hbrManager&method=queryReplicationState and log in using the same vCenter credentials. Note: for VMC on AWS, the user should be cloudadmin@vmc.local.
Replace the MOID value with the VM MOID obtained in step 2, then click "Invoke Method".
To keep monitoring the status of the offline sync, you can repeat the same steps.
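For repeated checks across several VMs, picking the MOID out of the browser URL can be scripted. The sketch below is illustrative only: the function name and the example host name are hypothetical, and the pattern assumes the URL layout shown in step 2.

```python
import re

# Matches the "vm-<digits>" identifier that follows
# "urn:vmomi:VirtualMachine:" in a vSphere Client URL.
MOID_PATTERN = re.compile(r"urn:vmomi:VirtualMachine:(vm-\d+):")


def extract_vm_moid(url):
    """Return the VM MOID found in a vSphere Client URL, or None."""
    match = MOID_PATTERN.search(url)
    return match.group(1) if match else None


# Hypothetical URL in the same layout as the example in step 2.
example_url = (
    "https://vcenter.example.com/ui/app/vm;nav=h/"
    "urn:vmomi:VirtualMachine:vm-123456:"
    "00000000-0000-0000-0000-000000000000/summary"
)
print(extract_vm_moid(example_url))
```

The function returns None rather than raising when the URL does not contain a VM reference, so a caller can skip URLs copied from other inventory objects.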