Slow migration performance and migrations stuck during the "Offline Sync"

Article ID: 440867

Products

VMware HCX

Issue/Introduction

  • HCX Bulk migrations become stuck during the switchover phase, specifically in the "Offline Sync" state. In some cases, the migrations eventually complete after several hours.
  • The "Offline Sync" issue occurs more frequently when you schedule the switchover for a future date rather than performing it immediately.
  • Replication Assisted vMotion (RAV) migrations using "Online Sync" also experience significantly slower performance during the switchover phase.
  • The following events are observed in /var/log/messages.log on the IX-R1 appliance, showing extremely low average transfer speeds (e.g., ~ 0.000 Mbps or ~ 0.199 Mbps):
    lwdproxy 911 - - connection 3 server finished after 4528 bytes (~ 0.000 Mbps)

    lwdproxy 911 - - connection 4 server finished after 3248 bytes (~ 0.000 Mbps)
  • On the HCX IX appliance, /var/log/vmware/hbrsrv.log shows zero throughput and connection resets:
    info hbrsrv[01484] [Originator@6876 sub=StatsLog groupID=VRID-########-####-####-####-############ opID=hsl-########] HbrEvent: {"eventID":"deltaSyncComplete","groupID":"VRID-########-####-####-####-############","diskID":"RDID-########-####-####-####-############","quiesceType":0,"bytesTransferred":139264,"duration":497,"throughput":0,"serverID":"########-####-####-####-############","hbrEvent":1}

    info hbrsrv[01456] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"lwdConnectionReset","groupID":"VRID-########-####-####-####-############","clientAddress":"[localhost]:44486","serverID":"########-####-####-####-############","hbrEvent":1}
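
To check quickly whether an appliance log shows the same symptoms, the lines above can be scanned with a short script. The following is a minimal Python sketch, not a VMware tool. Its regular expressions are derived only from the sample lines quoted in this article, so other HCX versions may log in a different format. The function name scan_lines and the 1.0 Mbps threshold are illustrative assumptions, not documented values.

```python
import json
import re

# Patterns inferred from the sample log lines in this article (assumption:
# the real log format matches these samples).
LWDPROXY_RE = re.compile(
    r"lwdproxy.*connection (\d+) server finished after (\d+) bytes "
    r"\(~ ([\d.]+) Mbps\)"
)
HBR_EVENT_RE = re.compile(r"HbrEvent: (\{.*\})")


def scan_lines(lines, min_mbps=1.0):
    """Return (kind, detail) findings for log lines that match the symptoms.

    min_mbps is an arbitrary illustrative threshold, not a VMware value.
    """
    findings = []
    for line in lines:
        match = LWDPROXY_RE.search(line)
        if match:
            mbps = float(match.group(3))
            if mbps < min_mbps:
                findings.append(("slow_connection", {
                    "connection": int(match.group(1)),
                    "bytes": int(match.group(2)),
                    "mbps": mbps,
                }))
            continue

        match = HBR_EVENT_RE.search(line)
        if not match:
            continue
        try:
            event = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # Skip truncated or malformed event payloads.

        event_id = event.get("eventID")
        if event_id == "lwdConnectionReset":
            findings.append(("connection_reset", {
                "groupID": event.get("groupID"),
            }))
        elif event_id == "deltaSyncComplete" and event.get("throughput") == 0:
            findings.append(("zero_throughput", {
                "groupID": event.get("groupID"),
                "bytesTransferred": event.get("bytesTransferred"),
            }))
    return findings
```

For example, after copying the logs off the appliance, calling scan_lines(open("messages.log")) returns one entry per slow connection, connection reset, or zero-throughput sync event. A non-empty result is consistent with the symptoms described above, but it does not by itself identify the cause.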

Resolution

This condition can occur in a VMware HCX environment when there are performance issues in the physical network.

Work with your network team to identify and resolve the network performance issue.

Additional Information

Troubleshooting “No Connection to VR Server: Unknown” Errors During Bulk Migrations with VMware HCX

HCX Bulk Migration operations and best practices

A similar situation is described in Design considerations for Azure VMware Solution Generation 2 Private Clouds:

"HCX RAV and Bulk migrations on Gen 2 can experience slower performance due to stalls during Base Sync and Online Sync phases. Customers should plan for longer migration windows and schedule waves accordingly for now. For suitable workloads, vMotion offers a faster, low‑overhead option when host and network conditions allow."