Cannot power on a virtual machine located in an NFS datastore

Article ID: 313027

Products

VMware vSphere ESXi

Issue/Introduction

You may see either of the following scenarios:
 
Scenario 1:
  • Powering on a virtual machine located in an NFS datastore fails.
  • You see an error similar to:

    Failed to lock the file. Cannot open the disk '/vmfs/volumes/UUID/VM_Name/VM_Name.vmdk' or one of the snapshot disks it depends on.
  • The /vmfs/volumes/datastore_name/virtual_machine_folder/vmware.log file contains entries similar to:

    vmx| DISKLIB-VMFS : "/vmfs/volumes/UUID/VM_Name/VM_Name-flat.vmdk" : failed to open (Failed to lock the file): AIOMgr_Open failed. Type 3
    vmx| DISK: Cannot open disk "/vmfs/volumes/UUID/VM_Name/VM_Name.vmdk": Failed to lock the file (16392).
    vmx| Msg_Post: Error
    vmx| [msg.disk.noBackEnd] Cannot open the disk '/vmfs/volumes/UUID/VM_Name/VM_Name.vmdk' or one of the snapshot disks it depends on.
     
  • The /var/log/vmkernel file contains entries similar to:

    NFSLock: 2819: failed to get lock on file VM_Name.vmdk 0x000000000000 on ip (ip): Busy
    WARNING: NFSLock: 2128: disk is being locked by other consumer
    NFSLock: 2819: failed to get lock on file <VM_Name>-flat.vmdk 0x000000000000 on ip (ip): Busy
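
The vmkernel entries above can be pulled out of a log with standard shell tools. The following is a minimal sketch, assuming the log lines have the same shape as the samples shown here; list_lock_failures is a hypothetical helper name, not a VMware command.

```shell
# Print the name of each file the host failed to lock, one per line,
# with duplicates removed.
# Assumption: lines look like the samples in this article, e.g.
#   NFSLock: 2819: failed to get lock on file <name> 0x... on ...: Busy
# list_lock_failures is a hypothetical helper, not part of ESXi.
list_lock_failures() {
    logfile="$1"
    grep 'NFSLock:.*failed to get lock on file' "$logfile" \
        | sed 's/.*failed to get lock on file \([^ ]*\) .*/\1/' \
        | sort -u
}
```

For example, running list_lock_failures against the vmkernel log on the host should print each affected disk file name once, which can help confirm which virtual disks are locked.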

Scenario 2:
 
  • You are unable to power on a virtual machine located in an NFS datastore.
  • The impacted virtual machine is currently in a Powered Off state in vCenter.
  • The virtual machine can be migrated with vMotion between ESXi hosts in vCenter.
  • Powering on the virtual machine fails with an error that names the ESXi host holding the lock on the VM.
 
  • The /var/log/vmkernel.log file of the locking ESXi host contains entries similar to:

/var/log/vmkernel.log:

YYYY-MM-DDTHH:MM:SS.Z cpu43:2186272)WARNING: NFSLock: 4084: File is being locked by a consumer on host <esxi-hostname> with exclusive lock.
YYYY-MM-DDTHH:MM:SS.Z cpu43:2186272)NFSLock: 5032: failed to get lock on file vm-name-flat.vmdk(.lck-0############0) 0x############ on <NFS-vol-name>  (esxi-ip): Busy
YYYY-MM-DDTHH:MM:SS.Z cpu12:2098728)WARNING: NFS: 5072: NFS volume <NFS-vol-name> average I/O latency 10122(us) has exceeded threshold 10000(us) for last 10 minutes
YYYY-MM-DDTHH:MM:SS.Z cpu16:2186272)WARNING: NFSLock: 4084: File is being locked by a consumer on host <esxi-hostname> with exclusive lock.
YYYY-MM-DDTHH:MM:SS.Z cpu16:2186272)NFSLock: 5032: failed to get lock on file vm-name-flat.vmdk(.lck-0############0) 0x############ on <NFS-vol-name>  (esxi-ip): Busy
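
The host name reported in these warnings is the piece of information Scenario 2 depends on. As a rough sketch, assuming the warning text matches the sample lines above, it can be extracted as follows; locking_hosts is a hypothetical helper name.

```shell
# Print each distinct host name reported as holding an exclusive lock.
# Assumption: warnings read
#   "File is being locked by a consumer on host <name> with exclusive lock."
# as in the sample log lines in this article.
# locking_hosts is a hypothetical helper, not part of ESXi.
locking_hosts() {
    logfile="$1"
    sed -n 's/.*File is being locked by a consumer on host \(.*\) with exclusive lock.*/\1/p' "$logfile" \
        | sort -u
}
```

Calling locking_hosts /var/log/vmkernel.log would then list the host or hosts named in the lock warnings.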
 

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

This issue occurs when a .lck-xxx file located in the virtual machine folder holds a lock on the vmdk file. When an ESXi/ESX host accesses a virtual machine disk file on an NFS-based datastore, a .lck-xxx lock file is generated in the same directory as the disk file. This file prevents other ESXi/ESX hosts from accessing the virtual disk file.
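
Because the lock files are hidden (their names start with a dot), a plain directory listing does not show them. The sketch below lists only the .lck-* files in a given folder; list_lck_files is a hypothetical helper name, and the folder path is whatever virtual machine folder you pass in.

```shell
# Print the names of all .lck-* files directly inside a VM folder.
# list_lck_files is a hypothetical helper, not part of ESXi.
list_lck_files() {
    vm_dir="$1"
    for f in "$vm_dir"/.lck-*; do
        # When nothing matches, the pattern is left as literal text; skip it.
        [ -e "$f" ] || continue
        basename "$f"
    done
}
```

For example, list_lck_files /vmfs/volumes/datastore_name/virtual_machine_folder prints one lock file name per line, or nothing if the folder holds no lock files.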

Resolution

Scenario 1:
 
To identify the file locks, see the KB article "Understanding the NFS .lck lock file to understand the ESX host and NFS filename it refers to".
 
To resolve this issue, delete the .lck-xxxx file located in the virtual machine folder.

Caution: Ensure the vmdk files are not in use by another virtual machine or backup process before performing these steps. Removing a valid .lck file can cause instability.

To delete the .lck-xxxx file:

  1. Ensure the virtual machine is powered off.
  2. Connect to the ESXi/ESX host using Secure Shell (SSH). 
  3. Go to the virtual machine folder by running the command:

    cd /vmfs/volumes/datastore_name/virtual_machine_folder
     
  4. List all files, including hidden files, by running the command:

    ls -la
     
  5. Remove the lock file by running the command:

    rm .lck-xxxx

    Where xxxx is a sequence of numbers, for example .lck-001455721.

    Note: Ensure the virtual machine is not running.
     
  6. Power on the virtual machine.
Note: When the virtual machine is powered on, you see a new .lck-xxxx file generated in the virtual machine folder. This is expected behavior.
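
The listing and removal steps above can be combined into one guarded helper. This is only a sketch under stated assumptions: remove_lck_files and its --apply argument are hypothetical names introduced here, and the function does not check whether the virtual machine is powered off or whether the disks are in use, so the caution above still applies. By default it only reports what it would remove.

```shell
# Dry-run by default: report the .lck-* files in a VM folder.
# Pass --apply as the second argument to actually delete them.
# remove_lck_files and --apply are hypothetical, not part of ESXi.
# This does NOT verify that the VM is powered off or the disks are unused.
remove_lck_files() {
    vm_dir="$1"
    mode="${2:-dry-run}"
    for f in "$vm_dir"/.lck-*; do
        # When nothing matches, the pattern is left as literal text; skip it.
        [ -e "$f" ] || continue
        if [ "$mode" = "--apply" ]; then
            rm -- "$f" && echo "removed $f"
        else
            echo "would remove $f"
        fi
    done
}
```

Running it first without --apply gives a chance to review the list of lock files before anything is deleted.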
 
 
Scenario 2:
 
If you know the name of the ESXi host holding the lock, follow these steps to resolve the NFS lock issue:
 

1. Power off the VM.

2. vMotion the VM to the ESXi host holding the lock.

3. Power on the VM from that host.

 
 

If this issue persists or no lock files are present to apply this resolution to, file a Support Request with VMware Support and note this KB article ID (2037507) in the problem description. For more information, see Creating and managing Broadcom support cases.