Troubleshooting Network File Copy (NFC) issues during clone and xvMotion

Article ID: 324581

Updated On: 04-25-2025

Products

VMware vCenter Server VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Deploying a virtual machine from a template with customization may fail with the error:

"Error during provisioning initial publish failed: Fault type is VC_FAULT_FATAL - Cannot install the vCenter server agent service. Cannot upload agent"

  • In /var/log/vmware/vpxd/vpxd.log on the vCenter Server, you may find entries similar to:

YYYY-MM-DDTHH:MM:SS.130+11:00 warning vpxd[06464] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] SSL: connect failed (5)
YYYY-MM-DDTHH:MM:SS.130+11:00 warning vpxd[06464] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [NFC ERROR]NfcNewAuthdConnectionEx: Failed to connect: SSL failed to connect to peer
YYYY-MM-DDTHH:MM:SS.130+11:00 warning vpxd[06464] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [NFC ERROR]NfcNewAuthdConnectionEx: Failed to connect to peer. Error: SSL failed to connect to peer
YYYY-MM-DDTHH:MM:SS.131+11:00 warning vpxd[06464] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [NFC ERROR]NfcEstablishAuthCnxToServer: Failed to create new AuthD connection: SSL failed to connect to peer
YYYY-MM-DDTHH:MM:SS.131+11:00 warning vpxd[06464] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [NFC ERROR]Nfc_BindAndEstablishAuthdCnx3: Failed to create a connection with server esxihostname: SSL failed to connect to peer
YYYY-MM-DDTHH:MM:SS.131+11:00 error vpxd[06464] [Originator@6876 sub=vpxNfcClient opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] Unable to connect to NFC server: SSL failed to connect to peer

YYYY-MM-DDTHH:MM:SS.131+11:00 error vpxd[06464] [Originator@6876 sub=HostAccess opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] Failed to upload files: N3Vim5Fault16HostConnectFault9ExceptionE(Fault cause: vim.fault.HostConnectFault

YYYY-MM-DDTHH:MM:SS.150+11:00 error vpxd[06464] [Originator@6876 sub=VmProv opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] Get exception while executing action vpx.vmprov.CustomizeVm: N3Vim5Fault18AgentInstallFailed9ExceptionE(Fault cause: vim.fault.AgentInstallFailed

YYYY-MM-DDTHH:MM:SS.167+11:00 error vpxd[06464] [Originator@6876 sub=vpxLro opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [VpxLRO] Unexpected Exception: N3Vim5Fault18AgentInstallFailed9ExceptionE(Fault cause: vim.fault.AgentInstallFailed

YYYY-MM-DDTHH:MM:SS.173+11:00 info vpxd[06464] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [VpxLRO] -- ERROR lro-889097 -- vm-xyz -- vim.VirtualMachine.clone: vim.fault.AgentInstallFailed:

--> (vim.fault.AgentInstallFailed)

  • Cloning a virtual machine across vCenter Servers may fail with a timeout error.
  • In /var/log/vmware/vpxd/vpxd.log, you may find entries similar to:

YYYY-MM-DDTHH:MM:SS.519+05:30 info vpxd[12112] [Originator@6876 sub=vpxTaskInfo opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] Timed out waiting for task vim.Task:haTask--nfc.NfcManager.copy-2467745464

YYYY-MM-DDTHH:MM:SS.520+05:30 warning vpxd[12112] [Originator@6876 sub=vpxLro opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [VpxLRO] Timeout waiting on updates for haTask--nfc.NfcManager.copy-2467745464

YYYY-MM-DDTHH:MM:SS.520+05:30 error vpxd[12134] [Originator@6876 sub=VmProv opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] Get exception while executing action vpx.vmprov.CopyVmFiles: N3Vim5Fault8Timedout9ExceptionE(Fault cause: vim.fault.Timedout

  • Deploying a virtual machine from a content library template may fail with the error "Cannot connect to host".
  • In /var/log/vmware/vpxd/vpxd.log, you may find entries similar to:

YYYY-MM-DDTHH:MM:SS.195+08:00 info vpxd[42107] [Originator@6876 sub=Default opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [VpxLRO] -- ERROR task-2xxx0 -- nfcManager -- nfc.NfcManager.copy: vim.fault.HostConnectFault:
--> Result:
--> (vim.fault.HostConnectFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = <unset>
-->    msg = "Cannot connect to host."

  • Cross vCenter migration (xvMotion) may fail with a timeout error.
  • In /var/log/vmware/vpxd/vpxd.log, you may find entries similar to:

YYYY-MM-DDTHH:MM:SS.242+13:00 error vpxd[09974] [Originator@6876 sub=VmProv opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] xVC Host Datastore Migrate failed at vpx.vmprov.CopyVmFiles for poweredOn VM 'TEST' (vm-xyz, ds:///vmfs/volumes/xxxxxxxxxxxxxxxxxxx/TEST/TEST.vmx) on host-xy (10.x.x.x) in pool resgroup-9 with ds ds:///vmfs/volumes/xxxxxxxxxxxxxxxxxxx/ to host-119 (10.x.x.x) in pool resgroup-9 with ds ds:///vmfs/volumes/xxxxxxxxxxxxxxxxxxx/ with migId 2xyxyxyxyxyxyxy0 with fault vim.fault.Timedout:

YYYY-MM-DDTHH:MM:SS.878Z error vpxd[7F33BE5EE700] [Originator@6876 sub=VmProv opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] [WorkflowImpl] Get exception while executing action vpx.vmprov.CopyVmFiles: vim.fault.Timedout

  • In /var/run/log/hostd.log on the ESXi host, you may find entries similar to:

YYYY-MM-DDTHH:MM:SS.421Z warning hostd[2101931] [Originator@6876 sub=Libs opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] user=vpxuser:VSPHERE.LOCAL\Administrator] [NFC ERROR]NfcNewAuthdConnectionEx: Failed to connect: SSL failed to connect to peer

YYYY-MM-DDTHH:MM:SS.421Z warning hostd[2101931] [Originator@6876 sub=Libs opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] user=vpxuser:VSPHERE.LOCAL\Administrator] [NFC ERROR]NfcNewAuthdConnectionEx: Failed to connect to peer. Error: SSL failed to connect to peer

YYYY-MM-DDTHH:MM:SS.421Z warning hostd[2101931] [Originator@6876 sub=Libs opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx]user=vpxuser:VSPHERE.LOCAL\Administrator] [NFC ERROR]NfcEstablishAuthCnxToServer: Failed to create new AuthD connection: SSL failed to connect to peer

YYYY-MM-DDTHH:MM:SS.421Z warning hostd[2101931] [Originator@6876 sub=Libs opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] user=vpxuser:VSPHERE.LOCAL\Administrator] [NFC ERROR]Nfc_BindAndEstablishAuthdCnx3: Failed to create a connection with server 10.x.x.x: SSL failed to connect to peer

YYYY-MM-DDTHH:MM:SS.421Z error hostd[2101931] [Originator@6876 sub=NfcManager opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] user=vpxuser:VSPHERE.LOCAL\Administrator] Unable to connect to NFC server: SSL failed to connect to peer

YYYY-MM-DDTHH:MM:SS.422Z error hostd[2101931] [Originator@6876 sub=NfcManager opID=xxxxx-xxxx-auto-xxx-xx:xxxxx-x-xx] user=vpxuser:VSPHERE.LOCAL\Administrator] Error encountered while opening clients for copy spec:

--> N3Vim5Fault16HostConnectFault9ExceptionE(Fault cause: vim.fault.HostConnectFault

  • For all of the symptoms above, /var/run/log/vmauthd.log on the destination ESXi host contains entries similar to:

YYYY-MM-DDTHH:MM:SS.493Z vmauthd[2117874]: Connect from remote socket (10.x.x.x:5xxx0).
YYYY-MM-DDTHH:MM:SS.493Z vmauthd[2117874]: Connect from 10.x.x.x
YYYY-MM-DDTHH:MM:SS.790Z vmauthd[2117870]: SSL: syscall error 110: Connection timed out
YYYY-MM-DDTHH:MM:SS.790Z vmauthd[2117870]: recv() FAIL: 110.

YYYY-MM-DDTHH:MM:SS.790Z vmauthd[2117870]: VMAuthdSocketRead: read failed. Closing socket for reading.
YYYY-MM-DDTHH:MM:SS.790Z vmauthd[2117870]: Read failed.
YYYY-MM-DDTHH:MM:SS.790Z vmauthd[2117870]: VMAuthdSocketWrite: No socket.


YYYY-MM-DDTHH:MM:SS.155Z vmauthd[2311827]: Connect from remote socket (10.x.x.x:5xxx0).
YYYY-MM-DDTHH:MM:SS.155Z vmauthd[2311827]: Connect from 10.x.x.x
YYYY-MM-DDTHH:MM:SS.160Z vmauthd[2311827]: recv() FAIL: 11.
YYYY-MM-DDTHH:MM:SS.160Z vmauthd[2311827]: VMAuthdSocketRead: read failed. Closing socket for reading.
YYYY-MM-DDTHH:MM:SS.160Z vmauthd[2311827]: Read failed.
YYYY-MM-DDTHH:MM:SS.160Z vmauthd[2311827]: VMAuthdSocketWrite: No socket.
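
To see quickly whether a vmauthd log contains the read failures shown above, the entries can be tallied by error code. The following is a minimal sketch, not a supported tool: the function name count_vmauthd_failures is hypothetical, and it assumes that grep, sed, sort, and uniq are available in the shell where the log is examined.

```shell
# Hypothetical helper: count vmauthd "recv() FAIL" entries per error code.
# Usage: count_vmauthd_failures /var/run/log/vmauthd.log
count_vmauthd_failures() {
    logfile="$1"
    # Match lines such as "vmauthd[2117870]: recv() FAIL: 110."
    grep 'recv() FAIL:' "$logfile" \
        | sed 's/.*recv() FAIL: \([0-9][0-9]*\).*/\1/' \
        | sort -n | uniq -c \
        | while read -r count code; do
              echo "errno $code: $count occurrence(s)"
          done
}
```

A log with no matching entries produces no output.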


Cause

These issues can occur for the following reasons:
  • Port 902 is not open between the source and destination ESXi hosts participating in the NFC connection (for example, a firewall is blocking connectivity).
  • An MTU mismatch in the environment is affecting connectivity between the source and destination ESXi hosts.
Important points about Network File Copy (NFC):
  • ESXi hosts use NFC when data needs to be copied over the network between datastores during a clone or xvMotion operation.
  • An NFC connection is established between two ESXi hosts when the destination ESXi host does not have access to the source datastore.
  • NFC requires bidirectional connectivity between the ESXi hosts over TCP port 902.
  • If jumbo frames are configured on the ESXi hosts for management or provisioning, the NFC connection uses a packet size of 8960 bytes.
  • The physical network between the ESXi hosts must support jumbo frames. Otherwise, large packets (larger than 1500 bytes) may be dropped, resulting in NFC connection failure.
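
One way to read the 8960-byte figure is as the TCP payload that fits in a 9000-byte MTU once the IPv4 and TCP headers are subtracted. The sketch below shows that arithmetic alongside the corresponding ICMP payload size. The function names are hypothetical, and the calculation assumes 20-byte IPv4 and TCP headers with no options.

```shell
# Largest TCP payload per packet: MTU minus IPv4 header (20) and TCP header (20).
# Assumes no IP or TCP options are present.
tcp_payload_for_mtu() { echo $(( $1 - 20 - 20 )); }

# Largest ICMP echo payload: MTU minus IPv4 header (20) and ICMP header (8).
icmp_payload_for_mtu() { echo $(( $1 - 20 - 8 )); }

tcp_payload_for_mtu 9000    # prints 8960
icmp_payload_for_mtu 9000   # prints 8972
icmp_payload_for_mtu 1500   # prints 1472
```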

Resolution

  • Identify the source and destination ESXi hosts participating in the clone or xvMotion operation.

Note: When VMs are deployed from content library templates, vCenter selects at random an ESXi host that has access to the template datastore and uses it as the source host.

  • Test bidirectional connectivity between the ESXi hosts over port 902 by running the commands below from an SSH session on each host. The second form uses -s to set the source VMkernel IP address:

nc -z <ESXi-IP> 902
nc -z <Destination-ESXi-VMK-IP> 902 -s <Source-ESXi-VMK-IP>

Output of a successful connection:
[root@esxi-1:~] nc -z 192.168.0.82 902
Connection to 192.168.0.82 902 port [tcp/authd] succeeded!


Note: If this test fails, port 902 is not open between the ESXi hosts. A firewall could be blocking connectivity.
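
When several hosts are involved, the same nc test can be repeated in a loop. This is a minimal sketch that uses only the nc -z form shown above. The function name check_nfc_port is hypothetical, and the sketch relies on nc returning a non-zero exit status when the connection fails.

```shell
# Hypothetical helper: check TCP port 902 from this host to each given address.
# Returns 0 if every host is reachable on port 902, 1 otherwise.
check_nfc_port() {
    failed=0
    for host in "$@"; do
        if nc -z "$host" 902 >/dev/null 2>&1; then
            echo "$host: port 902 reachable"
        else
            echo "$host: port 902 NOT reachable"
            failed=1
        fi
    done
    return "$failed"
}
```

Because NFC needs connectivity in both directions, run the check on the source host against the destination and again on the destination host against the source, for example: check_nfc_port 192.168.0.82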
 

  • If jumbo frames are configured on the management or provisioning interfaces of the ESXi hosts, test bidirectional connectivity with the command below:

vmkping -d -s 8972 <ESXi-IP>

In the command, the -d option sets the DF (Don't Fragment) bit on the IPv4 packet. 8972 is the payload size needed to test a 9000-byte MTU in ESXi: 9000 bytes minus the 20-byte IPv4 header and the 8-byte ICMP header.

Note: If this test fails, large packets are being dropped along the path between the ESXi hosts. An MTU mismatch along the path could cause this issue.

Output of a successful connection:
PING server(10.0.0.1): 8972 data bytes
8980 bytes from 10.0.0.1: icmp_seq=0 ttl=64 time=10.245 ms
8980 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.935 ms
8980 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.926 ms
--- server ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.926/4.035/10.245 ms
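
To separate a jumbo-frame problem from a general connectivity problem, the jumbo-size test can be paired with a standard-size one (1472 bytes of payload for a 1500-byte MTU). The sketch below is illustrative only. The function name check_mtu_path is hypothetical, and it assumes that vmkping exits with a non-zero status when no replies are received, which should be verified on your ESXi build.

```shell
# Hypothetical helper: compare a standard-size and a jumbo-size ping to one peer.
# Assumption: vmkping returns non-zero when the ping fails.
check_mtu_path() {
    peer="$1"
    # 1472 = 1500-byte MTU minus 20-byte IPv4 header and 8-byte ICMP header.
    if ! vmkping -d -s 1472 "$peer" >/dev/null 2>&1; then
        echo "$peer: basic connectivity failed (1472-byte payload)"
        return 2
    fi
    # 8972 = 9000-byte MTU minus the same 28 bytes of headers.
    if vmkping -d -s 8972 "$peer" >/dev/null 2>&1; then
        echo "$peer: jumbo frames pass end to end"
    else
        echo "$peer: jumbo frames dropped; check MTU along the path"
        return 1
    fi
}
```

If the 1472-byte test succeeds and the 8972-byte test fails, some device along the path is likely not configured for jumbo frames.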


Note: NFC traffic uses the management VMkernel port unless a dedicated provisioning VMkernel interface is configured on the ESXi hosts.

Additional Information

For additional information about vmkping, refer to Testing VMkernel network connectivity with the vmkping command.