HCX Bulk or RAV migration tasks fail during the vMotion phase. The migration halts, and the system reports that the source side relocate failed.
The HCX Manager UI displays the following errors:
vMotion failed. System Error. Source side error is : Source side relocate failed for the virtual machine. A fatal internal error occurred.
msg.svmotion.disk.copyphase.failed: Failed to copy one or more disks.
vob.vmotion.stream.check.block.mem.timed.out: VMotionStream timed out while waiting for disk's queue count to drop below the maximum limit.
The ESXi vmkernel.log on the source host shows zero or very low vMotion throughput:
XVMotion: 3064: timed out while waiting for disk 0's queue count to drop below the maximum limit of 32768 blocks.
VMotion bandwidth in last 1s: 0 bytes/s
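When several hosts are involved, it can help to scan a saved copy of vmkernel.log for these two messages. The following is a minimal sketch, not a supported tool. The regular expressions are derived only from the sample lines quoted above and may need adjusting for other ESXi builds. The function name scan_vmkernel_lines is hypothetical.

```python
import re

# Patterns derived from the sample log lines in this article (assumption:
# other ESXi builds may word these messages differently).
TIMEOUT_RE = re.compile(
    r"timed out while waiting for disk (\d+)'s queue count to drop below "
    r"the maximum limit of (\d+) blocks"
)
BANDWIDTH_RE = re.compile(r"VMotion bandwidth in last 1s: (\d+) bytes/s")


def scan_vmkernel_lines(lines):
    """Collect queue-timeout events and bandwidth samples from log lines.

    Returns (timeouts, bandwidths), where timeouts is a list of
    (disk_index, limit_blocks) tuples and bandwidths is a list of
    bytes-per-second integers, both in the order they appear.
    """
    timeouts = []
    bandwidths = []
    for line in lines:
        t = TIMEOUT_RE.search(line)
        if t:
            timeouts.append((int(t.group(1)), int(t.group(2))))
        b = BANDWIDTH_RE.search(line)
        if b:
            bandwidths.append(int(b.group(1)))
    return timeouts, bandwidths


if __name__ == "__main__":
    sample = [
        "XVMotion: 3064: timed out while waiting for disk 0's queue count "
        "to drop below the maximum limit of 32768 blocks.",
        "VMotion bandwidth in last 1s: 0 bytes/s",
    ]
    timeouts, bandwidths = scan_vmkernel_lines(sample)
    print("timeouts:", timeouts)
    print("bandwidth samples:", bandwidths)
```

To scan a real file, pass an open file handle as the lines argument. Repeated timeout events together with bandwidth samples at or near zero match the symptom described here.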
VMware HCX
VMware NSX
The failure is caused by asymmetric routing combined with a stateful firewall on NSX Tier-0 (T0) Gateways operating in Active/Active High Availability (HA) mode. In an Active/Active topology, egress and ingress traffic may traverse different T0 nodes. Because a stateful firewall expects bi-directional traffic to pass through the same node, it drops return packets that arrive at a node where no connection state exists. This results in the HCX vMotion stream timing out.
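The drop behavior described above can be illustrated with a toy model. This is a simplified sketch and not how NSX implements its firewall: each node keeps its own connection table, and nothing is synchronized between nodes. The class name, node names, and IP addresses are all hypothetical.

```python
class StatefulNode:
    """Toy stateful firewall node with a private connection table."""

    def __init__(self, name):
        self.name = name
        self.connections = set()  # (src, dst) pairs seen on egress
        self.drops = 0

    def egress(self, src, dst):
        # An outbound packet creates connection state on this node only.
        self.connections.add((src, dst))

    def ingress(self, src, dst):
        # A return packet is accepted only if this node saw the
        # matching outbound flow; otherwise it is counted as a drop.
        if (dst, src) in self.connections:
            return True
        self.drops += 1
        return False


if __name__ == "__main__":
    t0_a = StatefulNode("t0-node-a")
    t0_b = StatefulNode("t0-node-b")

    # Outbound migration traffic leaves through node A.
    t0_a.egress("10.0.0.10", "192.0.2.20")

    # Symmetric return path: the reply arrives at node A and is accepted.
    print("return via A accepted:", t0_a.ingress("192.0.2.20", "10.0.0.10"))

    # Asymmetric return path: the reply arrives at node B, which holds
    # no state for this flow, so it is dropped.
    print("return via B accepted:", t0_b.ingress("192.0.2.20", "10.0.0.10"))
    print("drops on B:", t0_b.drops)
```

In this model the drop counter on the second node rises only when return traffic takes a different path from the outbound traffic, which is the same pattern the Firewall counter under RX-Drops is used to confirm.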
To confirm that the firewall is dropping packets before applying the fix, run the following command on the NSX Edge CLI:
get logical-router interface <IX_Interface_UUID> stats
If the Firewall counter under RX-Drops increments while a migration is active, the stateful firewall is dropping the asymmetric return packets. For details on reading the output, see Interpreting NSX Edge Interface stats.