HCX - vMotion failure when ESXi host has multiple vmk interfaces enabled for relocation



Article ID: 321657


Updated On:

Products

VMware HCX
VMware Cloud on AWS

Issue/Introduction

This article helps identify and resolve a known issue in which HCX vMotion or RAV migrations fail because of the vmk configuration on the source ESXi host.


HCX vMotion or RAV migration may fail with the following error:

"Number of stream connections for source and destination do not match. This is unexpected behavior, could be caused by corrupt network data packet. Failing migration."


The following exceptions are logged when the failure occurs:


HCX app.log

2021-11-23 13:26:33.830 UTC [VmotionService_SvcThread-1845, Ent: HybridityAdmin, , TxId: ########-####-####-####-########0394] ERROR c.v.h.s.v.j.StartSourceSideRelocateVmWorkflow- Source side relocate failed for the virtual machine. error is Migration to host <192.168.##.##> failed with error Failure (195887105).  msg.checkpoint.precopyfailure:Migration to host <192.168.##.##> failed with error Failure (195887105).  vob.vmotion.enable.features.num.stream.ip.mismatch:vMotion migration [-1062725824:222640586382610500] Number of stream connections for source and destination do not match. This is unexpected behavior, could be caused by corrupt network data packet. Failing migration.

ESX VMkernel.log

2021-11-23T13:26:23.450Z cpu44:2583593)MigrateNet: 1751: 222640586382610500 S: Successfully bound connection to vmknic vmk1 - '192.168.##.##'
2021-11-23T13:26:23.542Z cpu7:2099068)MigrateNet: vm 2099068: 3263: Accepted connection from <192.168.##.##>
2021-11-23T13:26:23.542Z cpu7:2099068)Migrate: 373: Remote machine is ESX 3.0 or newer. VMotion version 0x50003.
2021-11-23T13:26:23.547Z cpu44:2583593)WARNING: VMotion: 7176: 222640586382610500 S: Number of stream connections for source and destination do not match. This is unexpected behavior, could be caused by corrupt network data packet. Failing migration

Location of HCX app.log:

  • HCX Manager : /common/logs/admin/app.log

Location of ESXi vmkernel.log:

  • ESXi host : /var/run/log/vmkernel.log
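
As a quick triage aid, the error signature can be searched for in a copy of either log. The following Python is a minimal sketch, not a supported tool: the function name `find_stream_mismatch` is an illustrative assumption, and only the error text itself is taken from the log excerpts above.

```python
# Error text copied from the app.log and vmkernel.log excerpts in this article.
SIGNATURE = "Number of stream connections for source and destination do not match"


def find_stream_mismatch(lines):
    """Return (line_number, line) pairs that contain the stream-mismatch error.

    `lines` can be any iterable of strings, for example an open file handle
    on a copy of app.log or vmkernel.log.
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SIGNATURE in line
    ]
```

Passing an open file handle for a copied log returns the matching lines with their line numbers, which makes it easier to correlate the HCX Manager and ESXi timestamps.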

Environment

VMware HCX

Cause

When the source ESXi host has a low-speed vmk interface (500 Mbps or less) and more than one vmk interface enabled for vMotion, vCenter creates two or more streams from different vmk interfaces to reach an aggregate throughput of 1 Gbps for data transfer in the relocation workflow.

The HCX Mobility Agent (MA) supports only a single streaming connection from vCenter, and it sets the numStream count to 1 in an attempt to limit the number of streams. If the MA detects more than one stream, it cancels the vMotion.

The following entry appears in the Mobility Agent (MA) log when the issue occurs:
2021-11-23T13:26:24.013Z info mobilityagent[01602] [Originator@6876 sub=VmotionMessageHandler-45185:10-45] VMotionCapability remotePageFaultPages 1
--> streamThreads 2

Location of MA log:

  • HCX Manager : /tmp/<Fleet-Appliance>/<Service-Mesh_Name>/<IX_appliance_Name>/var/log/vmware/Mobilityagent.log
  • IX appliance : /var/log/vmware/Mobilityagent.log
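
The `streamThreads` value in the MA log is the indicator to look for: any value above 1 corresponds to the condition described here. The following is a rough Python sketch for pulling that value out of a copy of Mobilityagent.log. The regular expression is based only on the excerpt shown above, so the helper names and the assumption that the field always appears as `streamThreads <number>` are illustrative.

```python
import re

# Matches the continuation line from the excerpt, e.g. "--> streamThreads 2".
STREAM_THREADS = re.compile(r"streamThreads\s+(\d+)")


def stream_thread_counts(lines):
    """Return every streamThreads value found, in log order."""
    counts = []
    for line in lines:
        match = STREAM_THREADS.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


def exceeds_single_stream(lines):
    """True if any logged streamThreads value is greater than 1."""
    return any(count > 1 for count in stream_thread_counts(lines))
```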

Note: If multiple vmk interfaces are enabled for vMotion on the ESXi host, the interface with the lowest speed is conservatively used.
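
The condition described in this section can be expressed as a simple check over a host's vmk inventory. The sketch below is illustrative only: the `Vmk` record and its fields are hypothetical, the inventory data would have to be collected separately, and it assumes that the low-speed interface is itself one of the vMotion-enabled interfaces.

```python
from dataclasses import dataclass

# Threshold stated in this article: 500 Mbps or less counts as low speed.
LOW_SPEED_MBPS = 500


@dataclass
class Vmk:
    """Hypothetical inventory record for one vmk interface."""
    name: str
    speed_mbps: int
    vmotion_enabled: bool


def at_risk(vmks):
    """True if the host has more than one vMotion-enabled vmk interface
    and at least one of them is low speed."""
    vmotion = [v for v in vmks if v.vmotion_enabled]
    has_low_speed = any(v.speed_mbps <= LOW_SPEED_MBPS for v in vmotion)
    return len(vmotion) > 1 and has_low_speed
```

For example, a host with `vmk1` at 500 Mbps and `vmk2` at 10000 Mbps, both enabled for vMotion, would be flagged, while a host with a single vMotion-enabled interface would not.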

Resolution

None at this time.

Workaround:
Disable the vMotion service on every low-speed vmk interface on any source-side ESXi host that has one.
Alternatively, enable the vMotion service on a single vmk interface on each source-side ESXi host and set the speed of the VMNICs to 1 Gbps.
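
The first workaround can also be applied from the ESXi shell instead of the vSphere Client. The Python sketch below only builds the command strings for review and does not execute anything. It assumes the `esxcli network ip interface tag remove` command with the `VMotion` tag is available on the ESXi version in use; the function name and the vmk names passed to it are placeholders.

```python
def vmotion_untag_commands(low_speed_vmks):
    """Build one esxcli command per vmk interface to remove its VMotion tag.

    The returned strings are meant to be reviewed and then run manually
    in the ESXi shell of the affected source host.
    """
    return [
        f"esxcli network ip interface tag remove -i {vmk} -t VMotion"
        for vmk in low_speed_vmks
    ]
```

Before and after running the generated commands, `esxcli network ip interface tag get -i <vmk>` can be used on the host to confirm which services are tagged on an interface.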

Additional Information

Impact/Risks:
  • This issue affects only the HCX vMotion and RAV migration workflows.
  • Other migration services, such as Cold and Bulk, are not affected.
  • The issue is unrelated to Network Extension services.