RAV Migration Failure: vMotion Errors When Switching Over Multiple Virtual Machines

Article ID: 411826

Products

VMware HCX

Issue/Introduction

  • When performing an RAV (Replication Assisted vMotion) migration of 20 or more VMs, vMotion may fail at switchover with the following error:

    vMotion failed. System Error. Source side error is : Source side relocate failed for the virtual machine. A specified parameter was not correct: Target side error is :
    Could not complete network copy for file [ma-ds-########-########-####-############] ###############/###############.nvram


  • The migration also shows an error when reviewed in the HCX Manager UI (screenshot not reproduced here).

  • On the source IX appliance, /var/log/messages shows the following entries, which indicate extremely poor throughput on the replication workstream:

    <132>1 <timestamps> +00:00 <Service-Mesh_name>-IX-I1 lwdproxy 887 - - connection 410 client errored after 3595807 bytes (~ 0.417 Mbps).  err: readfrom tcp <tunnel IP>:49866-><tunnel IP>:31031: read tcp <MA_replication_IP>:31031-><Source_ESX_Replication_IP>:61052: read: connection reset by peer
    <133>1 <timestamps> +00:00 <Service-Mesh_name>-IX-I1 lwdproxy 887 - - connection 410 server finished after 15276 bytes (~ 0.002 Mbps)
    <133>1 <timestamps> +00:00 <Service-Mesh_name>-IX-I1 lwdproxy 887 - - connection 410 end

    NOTE: The "connection reset by peer" message does not by itself indicate a problem. It only means that the connection is being torn down gracefully.

  • Deploying UPSA appliances on each respective HCX Uplink network and running iperf3 with the following command shows inconsistent throughput across the five parallel streams (results screenshot not reproduced here):

    iperf3 -c <target_IP> -t 10 -i .5 -P 5 -V -p 4500

  • On the target ESXi host, vmkernel.log (in /var/run/log/) shows that the estimated network bandwidth (4.279 MB/s, roughly 34 Mbps) is below the 150 Mbps required for vMotion:

    2026-02-23T03:56:42.033Z Wa(180) vmkwarning: cpu47:1390####)WARNING: VMotionRecv: 4256: 25376273######## D: the remote host closed the connection unexpectedly and migration has stopped. The closed connection probably results from a migration failure detected on$
    2026-02-23T03:56:42.033Z In(182) vmkernel: cpu47:1390####)Migrate: 101: 5376273######## D: MigrateState: Failed
    2026-02-23T03:56:42.033Z Wa(180) vmkwarning: cpu47:1390####)WARNING: Migrate: 257: 5376273######## D: Failed: Connection closed by remote host, possibly due to timeout (0xbad003f) @0x42001dedc88f
    2026-02-23T03:56:42.033Z In(182) vmkernel: cpu47:1390####)VMotion: 8057: 5376273######## D: Estimated network bandwidth 4.279 MB/s before failure

  • On the target IX-R appliance, app.log (in /common/logs/admin) reports a SystemError (195887167) with the message "Failed waiting for data". The entry confirms that the remote (source) host closed the connection unexpectedly:

    2026-02-23 03:57:11.396 UTC [VmotionService_SvcThread-106523, Ent: DEFAULT, , TxId: d20396d5-e572-######-#########] ERROR c.v.h.s.v.j.MonitorTargetSideProgressWorkflow- [migId=######-####-###-####-a580a#####] Target side relocate 'task-424185' failed for the virtual machine. Error is A general system error occurred: vMotion failed: unknown error msg.migrate.waitdata.platform:Failed waiting for data. Error 195887167. Connection closed by remote host, possibly due to timeout.  vob.vmotion.recv.connection.closed:vMotion migration [-14082####:253762734########] the remote host closed the connection unexpectedly and migration has stopped. The closed connection probably results from a migration failure detected on the remote host. faultTime:2026-02-23T03:56:42.191892Z. Total progress % is null.
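The throughput figures in the log excerpts above can be pulled out and compared against the 150 Mbps minimum with a short script. The following Python is a minimal sketch, not an official HCX tool: the function names are hypothetical, the regular expressions are written against the two sample message formats shown in this article only, and the conversion assumes that MB/s in vmkernel.log means 10^6 bytes per second.

```python
import re

VMOTION_MIN_MBPS = 150.0  # minimum throughput stated in this article

# Written against the vmkernel.log sample above, e.g.
# "Estimated network bandwidth 4.279 MB/s before failure"
VMK_BW = re.compile(r"Estimated network bandwidth ([\d.]+) MB/s")

# Written against the lwdproxy sample above, e.g.
# "connection 410 client errored after 3595807 bytes (~ 0.417 Mbps)"
LWD_BW = re.compile(
    r"connection (\d+) (client|server) \w+ after (\d+) bytes \(~ ([\d.]+) Mbps\)"
)


def vmkernel_bandwidth_mbps(line):
    """Return the estimated vMotion bandwidth in Mbps, or None if not present."""
    match = VMK_BW.search(line)
    if match is None:
        return None
    # Assumption: MB/s is megabytes per second, so multiply by 8 to get Mbps.
    return float(match.group(1)) * 8


def lwdproxy_throughput(line):
    """Return connection id, side, byte count and Mbps from an lwdproxy line."""
    match = LWD_BW.search(line)
    if match is None:
        return None
    return {
        "connection": int(match.group(1)),
        "side": match.group(2),
        "bytes": int(match.group(3)),
        "mbps": float(match.group(4)),
    }


def below_minimum(mbps):
    """True when a measured rate is under the 150 Mbps vMotion minimum."""
    return mbps < VMOTION_MIN_MBPS
```

Passing the vmkernel.log line that contains "Estimated network bandwidth 4.279 MB/s" to vmkernel_bandwidth_mbps gives roughly 34 Mbps, which below_minimum reports as under the requirement.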


Environment

VMware HCX

Cause

The data points to intermittent problems in the underlay network. iperf3 tests with five parallel streams show inconsistent throughput across the path, whereas a stable path would show balanced performance across the streams. These fluctuations matter because the environment fails to meet the 150 Mbps minimum throughput required for vMotion.
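One way to quantify "balanced" versus "inconsistent" is to compare the per-stream rates in an iperf3 JSON report, which iperf3 produces when the -J flag is added to the command. The following Python is a minimal sketch under stated assumptions: the function names and the 25% imbalance tolerance (MAX_SPREAD) are hypothetical and are not HCX or iperf3 guidance, and the aggregate received rate is compared against the 150 Mbps minimum from this article.

```python
import json
from statistics import mean, pstdev

RAV_MIN_MBPS = 150.0  # minimum throughput stated in this article
MAX_SPREAD = 0.25     # hypothetical tolerance for per-stream imbalance


def load_report(path):
    """Load a report saved with, for example, iperf3 ... -J > report.json."""
    with open(path) as handle:
        return json.load(handle)


def summarize_iperf3(report):
    """Summarize per-stream sender throughput from a TCP iperf3 JSON report."""
    end = report["end"]
    per_stream = [s["sender"]["bits_per_second"] / 1e6 for s in end["streams"]]
    total = end["sum_received"]["bits_per_second"] / 1e6
    average = mean(per_stream)
    # Coefficient of variation: 0 means every stream ran at the same rate.
    spread = pstdev(per_stream) / average if average > 0 else 0.0
    return {
        "per_stream_mbps": per_stream,
        "total_mbps": total,
        "spread": spread,
        "balanced": spread <= MAX_SPREAD,
        "meets_minimum": total >= RAV_MIN_MBPS,
    }
```

A report where "balanced" or "meets_minimum" is False matches the pattern described in this section and is worth passing to the network team along with the raw iperf3 output.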

Resolution

The organization's network team should investigate the throughput and bandwidth issues on the underlay network. To assist with diagnosis, run an MTR (My Traceroute) between the UPSA appliances to identify the active network path and pinpoint packet loss or latency at specific hops. HCX RAV requires a throughput of 150 Mbps or higher. See Network Underlay Minimum Requirements.
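If the mtr build in use supports JSON output, a report can be captured with a command such as mtr --report --report-cycles 100 --no-dns --json <target_IP> and then scanned for hops that show loss. The following Python is a minimal sketch: the function name is hypothetical, the availability of mtr with JSON support on the UPSA appliance is an assumption, and the report layout ("report" containing a "hubs" list with "count", "host", "Loss%" and "Avg" fields) is assumed from common mtr builds and should be checked against the actual output.

```python
def lossy_hops(mtr_report, loss_threshold=0.0):
    """Return (hop, host, loss_percent, avg_ms) for hops above the threshold.

    Assumes the layout {"report": {"hubs": [{"count": ..., "host": ...,
    "Loss%": ..., "Avg": ...}, ...]}}; verify this against your mtr version.
    """
    hubs = mtr_report.get("report", {}).get("hubs", [])
    flagged = []
    for hub in hubs:
        loss = float(hub.get("Loss%", 0.0))
        if loss > loss_threshold:
            flagged.append(
                (
                    int(hub.get("count", 0)),   # hop number along the path
                    hub.get("host", "?"),       # hop address (or name)
                    loss,                       # packet loss in percent
                    float(hub.get("Avg", 0.0)), # average latency in ms
                )
            )
    return flagged
```

Loss that appears at an intermediate hop but not at the final hop is often caused by ICMP rate limiting on that device, so loss that persists through to the destination is usually the more meaningful signal.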

Additional Information