VCF Operations for Networks OVF Deployment Fails at 98% Due to NFS Server Application-Level Latency

Article ID: 440349

Updated On:

Products

VCF Operations for Networks

Issue/Introduction

Symptoms

  • Deploying VCF Operations for Networks from VCF Fleet fails with the error "IO Exception occurred while performing the operation".
  • The Fleet log file /var/log/vrlcm/vmware_vrlcm.log shows the OVF deployment stuck at 98%:

YYYY-MM-DDTHH:MM:SS INFO vrlcm[1290] [Thread-xxxxxx] [c.v.v.l.d.v.d.i.OvfDeployLocal] -- completed percent uploaded-------------------------------------------->98
YYYY-MM-DDTHH:MM:SS INFO vrlcm[1290] [Thread-xxxxxx] [c.v.v.l.d.v.d.i.OvfDeployLocal] -- completed percent uploaded-------------------------------------------->98
YYYY-MM-DDTHH:MM:SS INFO vrlcm[1290] [Thread-xxxxxx] [c.v.v.l.d.v.d.i.OvfDeployLocal] -- completed percent uploaded-------------------------------------------->98

  • The vCenter log file vpxd.log shows the import task stuck at 98% and then failing with vmodl.fault.RequestCanceled:

YYYY-MM-DDTHH:MM:SS info vpxd[2549699] [Originator@6876 sub=VAppImport opID=xxxxxx] Import task progress: 98
YYYY-MM-DDTHH:MM:SS error vpxd[2549699] [Originator@6876 sub=VAppImport opID=xxxxxx] Caught exception while importing VM: N5Vmomi5Fault15RequestCanceled9ExceptionE(Fault cause: vmodl.fault.RequestCanceled
YYYY-MM-DDTHH:MM:SS info vpxd[2549699] [Originator@6876 sub=VAppImport opID=xxxxxx] Removing VM [vim.VirtualMachine:vm-xx,operations-networks-platform] due to failed import

 

  • The ESXi log file /var/log/vpxa.log shows an I/O error on the VCF Operations for Networks VMDK file, which resides on an external NFS server:

YYYY-MM-DDTHH:MM:SS Wa(164) Vpxa[13342005]: [Originator@6876 sub=DiskLib opID=SWI-67bb6058] DISKLIB-LINK  : DiskLinkClose: Failed to close '/vmfs/volumes/xxxxxx-xxxxxx/<operations-networks>/<operations-networks-platform>.vmdk': Input/output error

  

  • Packet captures show the external NFS server failing to reply to an NFSv3 GETATTR call for more than 30 seconds, and the ESXi host reports All Paths Down (APD) events for the destination NFS datastore:

YYYY-MM-DDTHH:MM:SS In(182) vmkernel: StorageApdHandler: 1193: APD start for 0xXXXXXX

YYYY-MM-DDTHH:MM:SS In(182) vmkernel: cpu6:2097645)StorageApdHandlerEv: 106: Device or filesystem with identifier [xxxxxx-xxxxxx] has entered the All Paths Down state.
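
The log signatures above can be checked with a small script when several log bundles need to be reviewed. The following is a minimal sketch, not a supported tool: the regular expressions are derived only from the excerpts shown in this article, and the function names (longest_stall, apd_events) are hypothetical. It assumes the log lines have already been read into a list of strings.

```python
import re

# Patterns derived from the log excerpts in this article (assumption:
# other builds may format these messages differently).
PROGRESS_RE = re.compile(r"completed percent uploaded-+>(\d+)")
APD_START_RE = re.compile(r"StorageApdHandler: \d+: APD start for (\S+)")
APD_ENTERED_RE = re.compile(
    r"identifier \[([^\]]+)\] has entered the All Paths Down state"
)


def longest_stall(lines):
    """Return (percent, count) for the longest run of identical progress values."""
    best = (None, 0)
    current, run = None, 0
    for line in lines:
        match = PROGRESS_RE.search(line)
        if not match:
            continue  # not a progress line
        percent = int(match.group(1))
        if percent == current:
            run += 1
        else:
            current, run = percent, 1
        if run > best[1]:
            best = (current, run)
    return best


def apd_events(lines):
    """Return the device or datastore identifiers named in APD messages."""
    found = []
    for line in lines:
        match = APD_START_RE.search(line) or APD_ENTERED_RE.search(line)
        if match:
            found.append(match.group(1))
    return found
```

A long run at the same percentage in vmware_vrlcm.log, together with APD identifiers in the vmkernel log for the same time window, is consistent with the symptoms described here. It does not by itself prove that the NFS server is the cause.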

Environment

VMware VCF Operations for Networks 9

Cause

The NFS storage server does not respond to RPC calls (such as GETATTR) within the expected time frame. In this case the server took more than 30 seconds to reply, which caused the ESXi host to declare an APD state for the datastore. The APD condition disrupts in-flight file writes, so the OVF import fails when it attempts to close or sync the VMDK files.
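
To confirm this behavior from a packet capture, RPC calls can be paired with their replies by transaction ID (XID) and the reply delay measured. The sketch below illustrates the pairing logic only. It assumes the capture has already been exported into (timestamp in seconds, XID, kind) records; that record layout and the function name slow_replies are hypothetical, and the 30-second default simply mirrors the delay observed in this case.

```python
def slow_replies(records, threshold_s=30.0):
    """Pair RPC calls with replies by XID and report slow or missing replies.

    records: iterable of (timestamp_seconds, xid, kind) in capture order,
             where kind is "call" or "reply" (hypothetical export format).
    Returns (slow, unanswered): slow is a list of (xid, latency_seconds),
    unanswered is a sorted list of XIDs that never received a reply.
    """
    pending = {}
    slow = []
    for timestamp, xid, kind in records:
        if kind == "call":
            # Keep the first transmission time so that retransmissions
            # do not hide the real delay.
            pending.setdefault(xid, timestamp)
        elif kind == "reply" and xid in pending:
            latency = timestamp - pending.pop(xid)
            if latency >= threshold_s:
                slow.append((xid, latency))
    return slow, sorted(pending)
```

If the slow or unanswered XIDs cluster around the time of the APD messages on the ESXi host, the delay is on the server side of the capture point.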

Resolution

  1. Engage the storage vendor to investigate server-side performance, controller health, or I/O spikes.
  2. Ensure the NFS server can handle the sustained I/O load required for large OVF deployments.
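
While the storage vendor investigates, metadata latency on the export can be sampled from a separate test client. The following is a rough sketch under several assumptions: it runs on a Linux client that mounts the same NFS export (not on the ESXi host), the function names and the 1-second warning threshold are arbitrary choices for illustration, and client-side attribute caching may answer os.stat() locally unless the export is mounted with an option such as noac or actimeo=0.

```python
import os
import time


def probe_stat_latency(path, samples=5, interval_s=1.0):
    """Time os.stat() on path several times and return the latencies in seconds.

    On an NFS mount with attribute caching disabled, each stat() is
    expected to trigger a GETATTR call to the server.
    """
    latencies = []
    for i in range(samples):
        start = time.monotonic()
        os.stat(path)
        latencies.append(time.monotonic() - start)
        if i < samples - 1:
            time.sleep(interval_s)
    return latencies


def summarize(latencies, warn_s=1.0):
    """Summarize the samples; warn_s is an arbitrary illustrative threshold."""
    worst = max(latencies)
    return {"samples": len(latencies), "max_s": worst, "slow": worst >= warn_s}
```

For example, summarize(probe_stat_latency("/mnt/nfs_test")) could be run repeatedly during a test deployment, where /mnt/nfs_test is a placeholder mount point. Multi-second results would support the server-side latency described in the Cause section.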