Cross-Host Cluster/Datacenter Storage vMotion Stalls at 22 Percent Due to Management Subnet Mismatch

Article ID: 440845

Products

VMware vSphere ESXi

Issue/Introduction

  • A Cross-Host Storage vMotion of a powered-on virtual machine stalls at 22% and makes no further progress.
  • The VMkernel adapter designated for the migration is explicitly configured to use the dedicated vMotion TCP/IP stack.
  • The /var/log/hostd.log on the source ESXi host indicates a Network File Copy (NFC) operation is invoked but fails to transfer the disk payload:
    [Originator@6876 sub=Vimsvc.TaskManager opID=<######> sid=<######> user=vpxuser:<######>] Task Created : haTask -- nfc.NfcManager.copy-<######>
    [Originator@6876 sub=NfcManager opID=<######> sid=<######> user=vpxuser:<######>] Copy operation invoked
  • The /var/log/vmware/vpxd/vpxd.log on the vCenter Server also shows the Network File Copy (NFC) operation failing with a timeout:
    YYYY-MM-DDTHH:MM:SS:sssZ vpxd[1221174] [Originator@6876 sub=vpxTaskInfo opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] Timed out waiting for task vim.Task:haTask--nfc.NfcManager.copy-65364400
    YYYY-MM-DDTHH:MM:SS:sssZ warning vpxd[1221174] [Originator@6876 sub=vpxLro opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] [VpxLRO] Timeout waiting on updates for haTask--nfc.NfcManager.copy-65364400
    YYYY-MM-DDTHH:MM:SS:sssZ warning vpxd[1221174] [Originator@6876 sub=vpxLro opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] [VpxLRO] Timeout waiting on updates for haTask--nfc.NfcManager.copy-65364400
    YYYY-MM-DDTHH:MM:SS:sssZ error vpxd[06546] [Originator@6876 sub=VmProv opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] Failed to track task vim.Task:haTask--nfc.NfcManager.copy-65364400 on host vim.HostSystem:host-61: Fault cause: vim.fault.Timedout
    -->
    --> backtrace:
    --> [backtrace begin] product: VMware VirtualCenter, version: 8.0.3, build: build-24853646, tag: vpxd, cpu: x86_64, os: linux, buildType: release
    --> backtrace[00] libvmacore.so[0x00531C43]
    .....
    YYYY-MM-DDTHH:MM:SS:sssZ error vpxd[06546] [Originator@6876 sub=VmProv opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] Aborting task tracking since task vim.Task:haTask--nfc.NfcManager.copy-65364400 failed
    YYYY-MM-DDTHH:MM:SS:sssZ error vpxd[06546] [Originator@6876 sub=VmProv opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] Get exception while executing action vpx.vmprov.CopyVmFiles:
    --> (vim.fault.Timedout) {
    -->    msg = "",
    --> }
    .....
    YYYY-MM-DDTHH:MM:SS:sssZ error vpxd[06546] [Originator@6876 sub=VmProv opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] Local-VC Host Datastore Migrate failed at vpx.vmprov.CopyVmFiles for poweredOn VM 'VM_NAME' (vm-####, ds:///vmfs/volumes/66######-########-####-##########00/VM_NAME/VM_NAME.vmx) on host-## (10.###.###.###) in pool resgroup-#### with ds ds:///vmfs/volumes/66######-########-####-##########00/ to host-##### (10.###.###.###) in pool resgroup-##### with ds ds:///vmfs/volumes/67######-########-####-##########00/ with migId 24###############81 with fault vim.fault.Timedout:  as Operation: Local-VC_NonDRS_ComputeandStoragevMotion
    YYYY-MM-DDTHH:MM:SS:sssZ info vpxd[06546] [Originator@6876 sub=VdbOpJournal opID=mn####yd-1143550-auto-oidb-h5:70117241-81-01] Removed journal id=14616
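
The log signature above can be searched for programmatically. The following is a minimal sketch, assuming the log lines have already been read into a list of strings; the function name find_nfc_timeouts and the abbreviated sample lines are illustrative and not part of any VMware tooling. The matched phrases are taken from the vpxd.log excerpt above.

```python
import re

# Phrases copied from the vpxd.log excerpt; the trailing digits are the task ID.
NFC_TIMEOUT = re.compile(
    r"(Timed out waiting for task|Timeout waiting on updates for|Failed to track task)"
    r"\s+(?:vim\.Task:)?(haTask--nfc\.NfcManager\.copy-\d+)"
)


def find_nfc_timeouts(lines):
    """Map each NFC copy task name to the timeout messages logged for it."""
    hits = {}
    for line in lines:
        match = NFC_TIMEOUT.search(line)
        if match:
            message, task = match.groups()
            hits.setdefault(task, []).append(message)
    return hits


# Abbreviated sample lines modeled on the excerpt (timestamps and opIDs omitted).
SAMPLE = [
    "vpxd[1221174] [sub=vpxTaskInfo] Timed out waiting for task vim.Task:haTask--nfc.NfcManager.copy-65364400",
    "warning vpxd[1221174] [sub=vpxLro] [VpxLRO] Timeout waiting on updates for haTask--nfc.NfcManager.copy-65364400",
    "error vpxd[06546] [sub=VmProv] Failed to track task vim.Task:haTask--nfc.NfcManager.copy-65364400 on host vim.HostSystem:host-61",
    "info vpxd[06546] [sub=VdbOpJournal] Removed journal id=14616",
]

for task, messages in find_nfc_timeouts(SAMPLE).items():
    print(task, len(messages))
```

A task name that appears with several of these messages, together with a migration that stalls at 22%, is consistent with the symptom described in this article, but it does not by itself identify the cause.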

Environment

VMware vSphere ESXi
VMware vCenter Server

Cause

During a Cross-Host Storage vMotion, disk data is transferred with the Network File Copy (NFC) protocol over TCP port 902. NFC traffic cannot use the dedicated vMotion TCP/IP stack, so the ESXi host defaults to sending the port 902 disk payload over the Management network.
The migration stalls at 22% because the subnet masks on the Management network do not match (e.g., /23 on the source host and /24 on the destination host). With mismatched masks, the two hosts disagree on whether the peer is on the local subnet, so one side may address the peer directly at Layer 2 while the other sends its traffic to the default gateway. This disrupts bidirectional communication and leaves TCP port 902 unreachable over the Management network.
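
The effect of a mask mismatch can be illustrated with Python's standard ipaddress module. This is a sketch only: the addresses are hypothetical, and on_link_view is an illustrative helper, not a VMware API. It shows that a /23 host and a /24 host can reach opposite conclusions about whether the other is on the local subnet.

```python
import ipaddress


def on_link_view(local_cidr: str, peer_ip: str) -> bool:
    """Return True if a host configured with local_cidr treats peer_ip as on-link."""
    interface = ipaddress.ip_interface(local_cidr)
    return ipaddress.ip_address(peer_ip) in interface.network


# Hypothetical Management addresses, chosen only to show the asymmetry.
source_cidr = "10.0.1.20/23"  # network 10.0.0.0/23
dest_cidr = "10.0.0.30/24"    # network 10.0.0.0/24

# The source treats the destination as local and addresses it directly.
print("source sees destination on-link:", on_link_view(source_cidr, "10.0.0.30"))

# The destination treats the source as remote and sends to its gateway.
print("destination sees source on-link:", on_link_view(dest_cidr, "10.0.1.20"))
```

When the two results differ, the forward and return paths are not symmetric. Whether that breaks the connection in a given environment depends on the gateway and any filtering in between.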

Resolution

To resolve this issue, correct the underlying network configuration or isolate the NFC traffic:

  • Address the Subnet Mismatch (Management Network): Configure the same, correct subnet mask on the Management VMkernel adapters of all participating ESXi hosts (e.g., /24 on both source and destination). This restores routing and Layer 2 communication for the NFC traffic that falls back to the Management network.
  • Deploy the Provisioning TCP/IP Stack (Recommended Architecture):
    • Leave the existing VMkernel adapter configured on the vMotion TCP/IP stack.
    • Create an additional VMkernel adapter on both the source and destination ESXi hosts.
    • Assign this new adapter exclusively to the Provisioning TCP/IP stack. This provides a dedicated, properly routed path for NFC traffic that inherently bypasses the Management network.
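
After either change, TCP port 902 reachability can be spot-checked. The following is a minimal sketch using Python's standard socket module; tcp_port_reachable is an illustrative helper, not a VMware tool. The result reflects the network path from the machine that runs the script, so it is only meaningful when run from a system on the same network path as the source host's NFC traffic.

```python
import socket


def tcp_port_reachable(host: str, port: int = 902, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

For example, tcp_port_reachable("192.0.2.10") checks port 902 on a placeholder address; substitute the destination host's Management or Provisioning VMkernel IP. A successful connection shows only that the port accepts connections, not that a full NFC transfer will complete.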

Additional Information

Cross vCenter Server and Cross Cluster vMotion fails with different TCP/IP Stacks
Unable to migrate powered off VM or template