VMware Cloud Foundation (VCF) environments using HCX for workload migration may experience RPO violations and migration stalls.
This typically occurs when a single virtual machine within a shared Service Mesh generates changed data faster than the replication engine can transmit it.
RPO violation of '####' minutes detected in HCX Cloud Manager logs.
Migration status for multiple VMs in the same group appears "Stalled" or "Slow".
Log entries in app.log: "Received high data churn event" with isHighChurn: true.
In the same log entries, calculatedDataChanges significantly exceeds diskChangesToTolerate.
"Offline sync started on source VM" takes longer than expected for some VMs.
VMware HCX 4.11.3
The migration pipeline is saturated by a "noisy neighbor" VM. When a VM's data churn rate exceeds the available WAN/VPN bandwidth or the processing capacity of the HCX Interconnect (IX) appliance, it creates a backlog. Because the IX appliance is a shared resource, this backlog starves concurrent migrations of bandwidth, leading to RPO violations across the migration group.
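The arithmetic behind this backlog can be sketched roughly as follows. This is a simplified model for illustration only, not HCX's internal scheduling logic: it assumes a constant churn rate, a fixed usable link throughput, and no compression or deduplication. The function names and the example figures are hypothetical.

```python
def sync_minutes(changed_mb: float, link_mbps: float) -> float:
    """Minutes needed to send changed_mb megabytes over a link_mbps (megabits/s) link."""
    megabits = changed_mb * 8
    return megabits / link_mbps / 60


def rpo_violated(churn_mb_per_min: float, rpo_minutes: float, link_mbps: float) -> bool:
    """True if the data changed during one RPO interval cannot be sent within that interval."""
    changed_mb = churn_mb_per_min * rpo_minutes
    return sync_minutes(changed_mb, link_mbps) > rpo_minutes


def group_saturated(churns_mb_per_min: list[float], link_mbps: float) -> bool:
    """True if the combined churn of all VMs sharing the link exceeds its throughput."""
    total_mbps = sum(churns_mb_per_min) * 8 / 60  # MB/min -> megabits/s
    return total_mbps > link_mbps
```

Under these assumptions, a VM changing 100 MB per minute with a 15-minute RPO on a 10 Mbps path needs more than 15 minutes to ship each delta, so it falls further behind every cycle. Because the path is shared, adding that VM's churn to an otherwise healthy group can push the group as a whole past the link's throughput.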
Identify the High Churn VM: Review the HCX Cloud Manager app.log for the JSON payload containing isHighChurn: true to identify the specific vm-ID.
Isolate the Workload: Move the high-churn VM to a dedicated Service Mesh, or reschedule its migration to a window with lower network contention.
Guest OS Optimization: Inspect the Guest OS of the high-churn VM for runaway processes, such as aggressive log rotation, database maintenance jobs, or third-party backup agents that should be disabled during migration.
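For the first step above, a small script can help pull the high-churn events out of app.log. The sketch below is a starting point under stated assumptions: it assumes each event is logged as a JSON object embedded in a single log line with isHighChurn as a top-level key, and the vmId key used for grouping is hypothetical, so adjust id_key to whatever identifier field the payload in your log actually carries.

```python
import json
from collections import Counter


def iter_json_payloads(line):
    """Yield every JSON object embedded in a single log line."""
    decoder = json.JSONDecoder()
    pos = line.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(line, pos)
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            pos = line.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            yield obj
        pos = line.find("{", end)


def high_churn_events(lines):
    """Yield payloads whose top-level isHighChurn flag is true."""
    for line in lines:
        if "isHighChurn" not in line:
            continue  # cheap pre-filter before attempting to parse
        for payload in iter_json_payloads(line):
            if payload.get("isHighChurn") is True:
                yield payload


def count_by_vm(events, id_key="vmId"):
    """Count events per VM; id_key is a hypothetical field name."""
    return Counter(str(event.get(id_key, "unknown")) for event in events)
```

To use it, open the log and pass the file object in, for example `with open("app.log") as f: print(count_by_vm(high_churn_events(f)))`. The VM that dominates the counts is the most likely candidate to isolate; comparing calculatedDataChanges against diskChangesToTolerate in each returned payload shows how far over the threshold it is.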