The snapshot consolidation task fails with the errors "Unable to access file since it is locked" and "An error occurred while consolidating disks: Failed to lock the file".
The error repeats on every consolidation attempt.
The /vmfs/volumes/<datastore_name>/<vm_name>/vmware.log file contains the following events:
YYYY-MM-DDThh:mm:ss.417Z In(05) vmx - SnapshotVMX_Consolidate: Starting online snapshot consolidate operation.
YYYY-MM-DDThh:mm:ss.541Z In(05) vmx - ConsolidateFillSnapDiskTransferArray: Item 0 source: /vmfs/volumes/vsan:<vsan Datastore UUID>/<VSAN OBJECT ID - VM HOME DIR>/Test-VM-000255.vmdk dest: /vmfs/volumes/vsan:<vsan Datastore UUID>/<VSAN OBJECT ID - VM HOME DIR>/Test-VM.vmdk. Cumulative size of redo logs (including meta-data): 6883860480.
YYYY-MM-DDThh:mm:ss.631Z In(05) vcpu-0 - [msg.disklib.numLinks.maxReached] This virtual machine has 255 or more redo logs in a single branch of its snapshot tree. The maximum supported limit has been reached, creating new snapshots will not be allowed. To create new snapshots, please delete old snapshots or consolidate the redo logs.
YYYY-MM-DDThh:mm:ss.745Z In(05) vcpu-0 - ConsolidateEnd: Snapshot consolidate complete: Failed to lock the file (5).
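To check a copied vmware.log for this signature without reading it by hand, the events above can be matched with a short script. The following is a minimal sketch in Python; the function name scan_vmware_log and the shape of the returned summary are illustrative assumptions, and the matched strings are taken only from the excerpt above.

```python
import re

# Message fragments copied from the vmware.log excerpt in this article.
CONSOLIDATE_END = re.compile(
    r"ConsolidateEnd: Snapshot consolidate complete: (?P<result>.+?)\s*$"
)
MAX_LINKS_MSG = "msg.disklib.numLinks.maxReached"
LOCK_FAILURE_TEXT = "Failed to lock the file"


def scan_vmware_log(lines):
    """Summarize consolidation results found in an iterable of log lines.

    Hypothetical helper: it only recognizes the messages shown in this
    article and makes no claim about other vmware.log formats.
    """
    summary = {
        "results": [],                  # text after "Snapshot consolidate complete:"
        "lock_failures": 0,             # attempts that ended with a lock failure
        "max_redo_logs_reached": False, # the 255 redo-log warning was logged
    }
    for line in lines:
        if MAX_LINKS_MSG in line:
            summary["max_redo_logs_reached"] = True
        match = CONSOLIDATE_END.search(line)
        if match:
            result = match.group("result")
            summary["results"].append(result)
            if LOCK_FAILURE_TEXT in result:
                summary["lock_failures"] += 1
    return summary
```

A typical use would be to pass an open file object, for example scan_vmware_log(open("vmware.log", errors="replace")), on a copy of the log taken from the VM's home directory.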
The following entries appear in /var/run/log/vmkernel.log during the consolidation attempt:
YYYY-MM-DDThh:mm:ss.551Z cpu26:19100970 opID=74f54281)DLX: 2650: vol '<VSAN OBJECT ID - VM HOME DIR>', lock at 125706240: Lock type: 10C00001. Read Lock(s) held on a file on volume <VSAN OBJECT ID>. numHolders:1 gblNumHolders:0,$
YYYY-MM-DDThh:mm:ss.551Z cpu26:19100970 opID=74f54281)[type 10c00001 offset 125706240 v 13942, hb offset 3215360 gen 263, mode 2, owner 00000000-00000000-0000-000000000000 mtime 11919512 num 1 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
YYYY-MM-DDThh:mm:ss.551Z cpu26:19100970 opID=74f54281)DLX: 2651: vol '<VSAN OBJECT ID - VM HOME DIR>', lock at 125706240: Lock type: 10C00001. owner(s) MAC: ##:##:##:##:##:##:
YYYY-MM-DDThh:mm:ss.551Z cpu26:19100970 opID=74f54281)[type 10c00001 offset 125706240 v 13942, hb offset 3215360 gen 263, mode 2, owner 00000000-00000000-0000-000000000000 mtime 11919512 num 1 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
YYYY-MM-DDThh:mm:ss.551Z cpu26:19100970 opID=74f54281)Fil3: 5033: Lock failed on file: .<VSAN OBJECT ID - VM HOME DIR>.lck on vol '##########' with FD: <FD c288 r4>
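The vmkernel.log entries above carry two details that help when tracing the lock: the file on which the lock attempt failed and the MAC address reported for the lock owner. The sketch below, in Python, pulls both out of a copied log. The function name find_lock_failures is an illustrative assumption, the patterns are derived only from the excerpt above, and the MAC pattern assumes real logs contain six hexadecimal pairs where this article shows the ##:##:##:##:##:## placeholder.

```python
import re

# Patterns derived from the vmkernel.log excerpt in this article.
LOCK_OWNER = re.compile(
    r"DLX: \d+: vol '[^']*', lock at \d+: Lock type: [0-9A-Fa-f]+\. "
    r"owner\(s\) MAC: (?P<mac>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})"
)
LOCK_FAILED = re.compile(
    r"Fil3: \d+: Lock failed on file: (?P<file>.+?) on vol '"
)


def find_lock_failures(lines):
    """Collect lock-owner MAC addresses and failed lock files from log lines.

    Hypothetical helper: it reports only what the log states and does not
    map a MAC address to a host.
    """
    owner_macs = set()
    failed_files = []
    for line in lines:
        owner = LOCK_OWNER.search(line)
        if owner:
            # Normalize case so the same address is not reported twice.
            owner_macs.add(owner.group("mac").lower())
        failed = LOCK_FAILED.search(line)
        if failed:
            failed_files.append(failed.group("file"))
    return {"owner_macs": sorted(owner_macs), "failed_files": failed_files}
```

The reported MAC addresses can then be compared against the network adapters of the ESXi hosts in the cluster, following the lock-investigation KB articles referenced in this article.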
VMware vSphere with snapshot-based backup solutions.
One or more snapshot disks in the snapshot chain of the impacted VM are still locked by a process other than the VM's own vmx process.
In a snapshot-based backup solution, the backup proxy VM attaches the snapshot disk in read-only mode to back up the VM. During this time, the ESXi host where the backup proxy VM is running places a read-only lock on the snapshot flat file.
If the backup job does not release the lock after the backup completes, the consolidation task fails and the snapshot chain continues to grow until it reaches the maximum of 255 redo logs in a single branch.
The consolidation task succeeds only when no process other than the VM's own vmx process holds a lock on any of the snapshot files.
Determine which locks are held on the files by following the KB articles below:
Investigating Virtual Machine file locks on ESXi Host(s)
Investigating virtual disk file locks on vSAN
If the "lsof" command shows that the lock is held by the hostd process, restart the hostd service on that ESXi host (for example, with /etc/init.d/hostd restart) to release the lock.