VM Deployment Fails with "Unable to Write VMX File" on NFS Datastore Due to APD and MTU Mismatch


Article ID: 394147


Updated On: 06-23-2025

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • While deploying a virtual machine on an NFS datastore, the operation fails with the following error:

Unable to write VMX file: /vmfs/volumes/<NFS Datastore>/TEST/TEST.vmx. An error occurred while syncing configuration file "/vmfs/volumes/<NFS Datastore>/TEST/TEST.vmx~": 5 (Input/output error).

Validation:

  • The ESXi host experienced an All Paths Down (APD) condition, which disconnected the NFS datastore from the storage array. This was observed in /var/log/vmkernel.log and /var/log/vobd.log, where a series of events recorded the loss and subsequent restoration of the connection to the NFS server. The entries below show the NFS communication failure between the ESXi host and the NFS server; the "closed with dirty buffers" alert indicates possible data loss from incomplete file operations.

/var/log/vmkernel.log

2025-04-07T08:33:18.447Z cpu2:2097622)StorageApdHandlerEv: 110: Device or filesystem with identifier [<NFS UUID>] has entered the All Paths Down state.
2025-04-07T08:33:25.448Z cpu7:2100001 opID=5b442ffb)World: 12077: VC opID sps-Main-545873-768-14794-71-4a-99a4 maps to vmkernel opID 5b442ffb
2025-04-07T08:33:25.448Z cpu7:2100001 opID=5b442ffb)SunRPC: 3291: Synchronous RPC cancel for client 0x430f6f401600 IP <NFS Target IP>.8.1 proc 3 xid 0x14cda8e8 attempt 1 of 3
2025-04-07T08:34:42.448Z cpu6:2098511)WARNING: NFS: 338: Lost connection to the server <server name> mount point <nfs mount point>, mounted as <NFS UUID> ("<NFS Datastore Name>")
2025-04-09T04:46:30.510Z cpu48:2102161 opID=85a04160)ALERT: BC: 3177: File TEST.vmx~ closed with dirty buffers. Possible data loss.
2025-04-09T04:46:33.937Z cpu48:2098513)NFS: 347: Restored connection to the server <server name> mount point <NFS Mount point>, mounted as <NFS UUID> ("<NFS Datastore Name>")

/var/log/vobd.log

2025-04-09T04:43:39.376Z: [APDCorrelator] 5521000780us: [vob.storage.apd.start] Device or filesystem with identifier [<NFS UUID>] has entered the All Paths Down state.
2025-04-09T04:43:39.376Z: [APDCorrelator] 5521103372us: [esx.problem.storage.apd.start] Device or filesystem with identifier [<NFS UUID>] has entered the All Paths Down state.
2025-04-09T04:45:03.379Z: [vmfsCorrelator] 5605001637us: [vob.vmfs.nfs.server.disconnect] Lost connection to the server <server name> mount point <NFS Mount point>, mounted as <NFS UUID> ("<NFS Datastore Name>")
2025-04-09T04:45:03.379Z: [vmfsCorrelator] 5605105979us: [esx.problem.vmfs.nfs.server.disconnect] <server name> <NFS Mount point> <NFS UUID> <NFS Datastore Name>
2025-04-09T04:45:59.380Z: [APDCorrelator] 5661001644us: [vob.storage.apd.timeout] Device or filesystem with identifier [<NFS UUID>] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

This issue occurs when MTU (Maximum Transmission Unit) settings are mismatched along the network path between the ESXi host and the NFS storage. If components in the path are configured with conflicting MTU sizes (e.g., 1500 vs. 9000), jumbo frames can be silently dropped, resulting in packet loss and dropped NFS connections. This leads to APD (All Paths Down) events and failures to write VM configuration (.vmx) files.

Resolution

  1. Validate Jumbo Frame Connectivity

    Run the following command from the affected ESXi host:

    vmkping -I vmkX -s 8972 -d <NFS_Target_IP> -c 30

    • Replace vmkX with the VMkernel interface used for NFS traffic.

    • If the test results in 100% packet loss while a standard-size vmkping (without -s and -d) to the same target succeeds, an MTU mismatch along the path is confirmed.
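    The payload size of 8972 is not arbitrary: it is the 9000-byte jumbo MTU minus the 20-byte IP header and 8-byte ICMP header, and -d sets the don't-fragment bit so oversized frames are dropped instead of fragmented. The arithmetic, plus standard ESXi commands to inspect the currently configured MTU (interface and switch names below are examples):

    ```shell
    # vmkping payload = jumbo MTU - IP header (20 bytes) - ICMP header (8 bytes)
    echo $((9000 - 20 - 8))   # prints 8972

    # On the ESXi host, list VMkernel interfaces and their configured MTU:
    # esxcli network ip interface list
    # List standard vSwitches (the MTU column must match end-to-end):
    # esxcli network vswitch standard list
    ```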

  2. Remediate MTU Configuration

    Choose one of the following based on the environment's capability and design:

    • Option 1: Lower the MTU on the ESXi host's VMkernel and associated vSwitch to 1500 bytes.

    • Option 2: Ensure jumbo frames (MTU 9000) are consistently configured end-to-end, including:

      • ESXi VMkernel interfaces

      • vSwitch/Distributed Switch

      • Physical network switches

      • Storage target interfaces
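    On the ESXi side, either option can be applied with esxcli. The commands below are a sketch for a standard vSwitch; the vSwitch name (vSwitch1) and VMkernel interface (vmk1) are examples and must be replaced with the actual objects carrying NFS traffic. Set the vSwitch MTU before the VMkernel MTU, and apply the change only during a maintenance window, since it briefly disrupts traffic on the affected uplinks:

    ```shell
    # Option 2: enable jumbo frames (MTU 9000) on the vSwitch carrying NFS traffic
    esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000

    # ...then on the VMkernel interface used for NFS
    esxcli network ip interface set --interface-name=vmk1 --mtu=9000

    # Option 1 is the same commands with --mtu=1500 on both objects.
    # For a Distributed Switch, the MTU is set in vCenter under the
    # switch's Advanced properties instead of per-host esxcli.
    ```

    Re-run the vmkping test from step 1 afterwards to confirm jumbo frames now pass end-to-end.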