First, check whether any backup proxy servers are in use. If there are, check whether the affected disk is still mounted to a proxy server. If the disk is attached to a proxy server, remove it from that server, ensuring "Delete from disk" is NOT selected.
Note: There may be more than one proxy server in use. Make sure to check all proxy servers.
vSAN uses .lck files. Each .lck file is named after the UUID of the vSAN object it represents.
To check the descriptor, change directory into the VM namespace.
For example:
cd /vmfs/volumes/vsanDatastore/<VM_Namespace>
Then run:
grep RW VMDiskName.vmdk
You'll see output similar to this:
# Extent description
RW 209715200 VMFS "vsan://e7c66759-680f-e86b-798d-a0369fa131f0"
The UUID "e7c66759-680f-e86b-798d-a0369fa131f0" is the vSAN object representing the virtual disk (vdisk) for that descriptor.
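As a minimal sketch of how the descriptor line maps to a lock file name, the helper below extracts the UUID from an extent line and prints the matching hidden .lck name. The function name vsan_uuid_from_extent is hypothetical and not part of any ESXi tooling; the sample line is the one shown above.

```shell
# Hypothetical helper (not an ESXi command): print the vSAN object UUID
# found in a descriptor extent line passed as the first argument.
vsan_uuid_from_extent() {
    # Keep only the text between "vsan:// and the closing double quote.
    printf '%s\n' "$1" | sed -n 's|.*"vsan://\([0-9a-fA-F-]*\)".*|\1|p'
}

# Sample extent line taken from the output above.
extent='RW 209715200 VMFS "vsan://e7c66759-680f-e86b-798d-a0369fa131f0"'
uuid=$(vsan_uuid_from_extent "$extent")

# The hidden lock file for this object is named .<uuid>.lck
echo ".${uuid}.lck"
```

On a host, the extent line would come from the grep RW command rather than a hard-coded string.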
Note: If you get a "Device or resource busy" error, SSH to the host the VM is registered to and work from that host.
The following command shows all .<uuid>.lck files within the vSAN namespace directory:
ls -lah .*.lck
You'll see something similar to this:
-rw------- 1 root root 0 Jul 13 2017 .e7c66759-680f-e86b-798d-a0369fa131f0.lck
There may also be non-hidden lock files, which you can diagnose similarly by running the following:
ls -lah *.lck
Run the following command to show the lock details for this vSAN object:
vmfsfilelockinfo -p .e7c66759-680f-e86b-798d-a0369fa131f0.lck
vmfsfilelockinfo Version 2.0
Looking for lock owners on ".e7c66759-680f-e86b-798d-a0369fa131f0.lck"
"<VMname>.vswp.lck" is locked in Exclusive mode by host having mac address ['xx:xx:xx:xx:xx:xx']
Trying to make use of Fault Domain Manager
----------------------------------------------------------------------
Found 6 ESX hosts using Fault Domain Manager.
----------------------------------------------------------------------
Searching on Host esxi1
Searching on Host esxi3
Searching on Host esxi4
Searching on Host esxi2
Searching on Host esxi6
Searching on Host esxi5
MAC Address : xx:xx:xx:xx:xx:xx
Host owning the lock on file is esxi5, lockMode : Exclusive
Total time taken : 0.11339905299246311 seconds.
If no lock is found, the output looks like this:
vmfsfilelockinfo Version 2.0
Looking for lock owners on ".e7c66759-680f-e86b-798d-a0369fa131f0.lck"
".e7c66759-680f-e86b-798d-a0369fa131f0.lck" is not locked by any ESX host and is Free
Total time taken : 0.037906300276517868 seconds.
Alternatively, you can run vmkfstools -D against this file, which also shows the lock details for this vSAN object.
Example:
vmkfstools -D .e7c66759-680f-e86b-798d-a0369fa131f0.lck
You should see output similar to this:
Lock [type 10c00001 offset 152799232 v 830, hb offset 3969024
gen 215, mode 1, owner 5c576ea9-e19f62dc-07eb-a0369fa12052 mtime 1107249
num 0 gblnum 0 gblgen 0 gblbrk 0]
Addr <4, 354, 1>, gen 3, links 1, type reg, flags 0, uid 0, gid 0, mode 600
len 0, nb 0 tbz 0, cow 0, newSinceEpoch 0, zla 4305, bs 8192
The last segment of the owner field (a0369fa12052 in this example) is the MAC address of the management VMkernel port. It should correspond to a host in the vSAN cluster.
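To compare the owner value with the MAC addresses of the hosts' VMkernel ports, it helps to rewrite the last group of the owner field in colon notation. The following is a small sketch of that conversion; the function name owner_to_mac is hypothetical, and the owner value is the one from the sample output.

```shell
# Hypothetical helper (not an ESXi command): convert the last group of a
# vmkfstools -D owner value into colon-separated MAC notation.
owner_to_mac() {
    # ${1##*-} keeps the text after the last hyphen; sed then appends a
    # colon to every pair of characters and trims the trailing colon.
    printf '%s\n' "${1##*-}" | sed 's/../&:/g; s/:$//'
}

owner_to_mac 5c576ea9-e19f62dc-07eb-a0369fa12052
# prints a0:36:9f:a1:20:52
```

The result can then be matched against the MAC address of each host's management VMkernel port.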
Note: During the life cycle of a powered-on virtual machine, several of its files transition between various legitimate lock states. The lock mode indicates the type of lock on the file. The lock modes are:
- mode 0 = no lock
- mode 1 = exclusive lock (the vmx file of a powered-on virtual machine, the currently used disk (flat or delta), *vswp, and so on)
- mode 2 = read-only lock (for example, on the ..-flat.vmdk of a running virtual machine with snapshots)
- mode 3 = multi-writer lock (for example, used for MSCS cluster disks or FT VMs)
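The mode list above can be sketched as a small lookup, together with a filter that pulls the lock mode out of vmkfstools -D style text. Both function names are hypothetical. The filter matches "mode N, owner" so that it does not pick up the unrelated "mode 600" file permission field on a later line, and any value outside the four listed modes is reported as unknown.

```shell
# Hypothetical helper: map a lock mode number to the names listed above.
lock_mode_name() {
    case "$1" in
        0) echo "no lock" ;;
        1) echo "exclusive" ;;
        2) echo "read-only" ;;
        3) echo "multi-writer" ;;
        *) echo "unknown" ;;   # not one of the modes listed in this article
    esac
}

# Hypothetical helper: read vmkfstools -D style text on stdin and print
# the lock mode, anchored on ", owner" to skip the permission "mode 600".
lock_mode_from_output() {
    sed -n 's/.*mode \([0-9][0-9]*\), owner.*/\1/p'
}

# Sample line taken from the vmkfstools -D output above.
sample='gen 215, mode 1, owner 5c576ea9-e19f62dc-07eb-a0369fa12052 mtime 1107249'
mode=$(printf '%s\n' "$sample" | lock_mode_from_output)
echo "mode $mode: $(lock_mode_name "$mode")"
# prints mode 1: exclusive
```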
Once you have the name of the host owning the lock, SSH into that host and try restarting the management services hostd and vpxa with the following command:
/etc/init.d/hostd restart && /etc/init.d/vpxa restart
If the lock is still present, run:
lsof |grep <vmname> && ps|grep <vmname>
For example:
[root@esxi4:~] lsof |grep cent7_2 && ps|grep cent7_2
7565528 vmx FILE 43 /vmfs/volumes/vsan:52bea6daf62777db-6515bb0268f25523/18db7d62-56b6-8186-64ba-0050560181e8/cent7_2.vmx.lck
7565528 vmx FILE 44 /vmfs/volumes/vsan:52bea6daf62777db-6515bb0268f25523/18db7d62-56b6-8186-64ba-0050560181e8/cent7_2.vmx
7565528 vmx FILE 45 /vmfs/volumes/vsan:52bea6daf62777db-6515bb0268f25523/18db7d62-56b6-8186-64ba-0050560181e8/cent7_2.vmx~
7565528 vmx FILE 82 /vmfs/volumes/vsan:52bea6daf62777db-6515bb0268f25523/18db7d62-56b6-8186-64ba-0050560181e8/cent7_2.nvram
7565529 0 vmm0:cent7_2
7565533 0 vmm1:cent7_2
7565535 7565528 vmx-filtPoll:cent7_2
7565536 7565528 vmx-mks:cent7_2
7565537 7565528 vmx-svga:cent7_2
7565538 7565528 vmx-vcpu-0:cent7_2
7565540 7565528 vmx-vcpu-1:cent7_2
The first column of the lsof output (7565528 in this example) is the world process ID. You can kill this process by running kill <PID>. Make sure you run this command only from the host or hosts the VM is NOT registered to.
Note: If the VM is powered down there should be no open files (lsof) or active processes (ps) for the VM. Additionally, you should only see open files or active processes on the host the VM is registered to when the VM is powered on.
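When checking several hosts, it can be convenient to reduce the lsof output to just the process IDs of vmx processes that hold files for the VM. The sketch below only lists IDs and never kills anything; the function name vmx_pids_from_lsof is hypothetical, and it assumes the column layout shown in the sample output (ID first, process name second).

```shell
# Hypothetical helper: read lsof-style text on stdin and print the unique
# IDs of vmx processes whose lines mention the VM name given as $1.
vmx_pids_from_lsof() {
    # Column 1 is the ID and column 2 the process name in the sample output.
    awk -v vm="$1" '$2 == "vmx" && index($0, vm) { print $1 }' | sort -u
}

# Two sample lines taken from the lsof output above.
sample='7565528 vmx FILE 43 /vmfs/volumes/vsan:52bea6daf62777db-6515bb0268f25523/18db7d62-56b6-8186-64ba-0050560181e8/cent7_2.vmx.lck
7565528 vmx FILE 44 /vmfs/volumes/vsan:52bea6daf62777db-6515bb0268f25523/18db7d62-56b6-8186-64ba-0050560181e8/cent7_2.vmx'

printf '%s\n' "$sample" | vmx_pids_from_lsof cent7_2
# prints 7565528
```

On a host, the input would be piped from lsof instead of a sample string. Review the listed IDs manually before running kill, and only on hosts the VM is NOT registered to.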
If you find no locks with either of the lock commands, you can try running lsof |grep <vmname> && ps|grep <vmname> on all hosts in the cluster to see whether a process exists on more than one host. If there are running processes, kill the process on any host that has a hung process related to the VM.
Note: Make sure you're only killing the process on hosts the VM is NOT registered to especially if the VM is powered on.
If neither vmfsfilelockinfo -p nor vmkfstools -D finds a lock, lsof |grep <vmname> && ps|grep <vmname> finds no active process for the VM on any host, and you are still getting file lock errors, then you are dealing with a phantom lock, and a rolling reboot of the cluster is required to clear it.
Workaround:
To check all the VM files and vSAN object lock files, and to find which files are locked and which host is locking them, run the following command in the VM directory:
for file in *; do echo ${file}; vmfsfilelockinfo -p ${file} |grep -i mode; done
Output example:
Test-3f9d789c.hlog
Test-ec315dde.vswp
Test-ec315dde.vswp.lck
"Test-ec315dde.vswp.lck" is locked in Exclusive mode by host having mac address ['00:XX:56:XX:11:XX']
Host owning the lock on file is <Hostname>, lockMode : Exclusive
Test.nvram
"Test.nvram" is locked in Exclusive mode by host having mac address ['00:XX:56:XX:11:XX']
Host owning the lock on file is <Hostname>, lockMode : Exclusive
Test.vmdk
Test.vmsd
Test.vmx
Normally, the output shows the owner host. If you find a different host, save the name of that host.
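To spot lock holders other than the expected owner in a long listing, the loop output can be filtered for "Host owning the lock" lines. The following is a sketch under the assumption that the output uses the wording shown in the samples in this article; the function name foreign_lock_hosts is hypothetical, and the host names esxi5 and esxi2 are only illustrative.

```shell
# Hypothetical helper: read vmfsfilelockinfo-style text on stdin and print
# every lock-owning host name that differs from the expected owner ($1).
foreign_lock_hosts() {
    # Extract the host name, drop exact matches of the expected owner,
    # and collapse duplicates.
    sed -n 's/^Host owning the lock on file is \(.*\), lockMode.*/\1/p' |
        grep -v -x -F -- "$1" | sort -u
}

# Illustrative input: one lock held by the expected owner, one by another host.
sample='Host owning the lock on file is esxi5, lockMode : Exclusive
Host owning the lock on file is esxi2, lockMode : Exclusive'

printf '%s\n' "$sample" | foreign_lock_hosts esxi5
# prints esxi2
```

On a host, the input would be piped from the for loop instead of a sample string. An empty result means every reported lock belongs to the expected owner.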
To check all .<uuid>.lck files, run the following command:
for file in .*lck; do echo ${file}; vmfsfilelockinfo -p ${file} |grep -i mode; done
To check all the files for VMs that have spaces in the name, run the following command:
for file in *; do echo "${file}"; vmfsfilelockinfo -p "${file}" |grep -i mode; done