A virtual machine was discovered to have a snapshot chain 32 levels deep.
All consolidation efforts failed with the following error:
Operation failed!
Task name Remove snapshot
Target <VM name>
Status An error occurred while deleting a snapshot: One of the disks in this virtual machine is already in use by a virtual machine or by a snapshot.
No further snapshots could be taken of the virtual machine.
Backups were failing.
Consolidation attempts issued via the vCenter GUI failed.
Upon investigation, all of the snapshots were found to be related to the backup appliance.
Turning off the backup appliance proxy did not release the disks, and consolidation still failed.
When the locks were investigated (see KB: Investigating Virtual Machine file locks on ESXi Host(s)), they all originated from the host where the VM was running.
On that host, a lock check (lsof) showed that every lock was held by the same process: the vmx process of the running VM.
It would appear that a process on the host locked all the files, blocking consolidation attempts. This was not cleared by a host reboot.
The only solution was to manually clone each disk individually up to the N-1 snapshot. (A running VM writes only to the last snapshot in a chain, so the content of every earlier link should never change.)
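To make the N-1 idea concrete, here is a minimal Python sketch that walks a snapshot chain from the leaf back to the base disk and picks the link to clone from. It relies on the parentFileNameHint entry that VMDK text descriptors use to name their parent. The function names and the in-memory dictionary of descriptors are hypothetical, and the sketch assumes the hints are bare file names in the same directory; it is an illustration, not the tooling used in this recovery.

```python
import re

# Parent link in a VMDK text descriptor, e.g. parentFileNameHint="vm-000031.vmdk"
PARENT_HINT = re.compile(r'^parentFileNameHint="(?P<parent>[^"]+)"', re.MULTILINE)


def parent_of(descriptor_text):
    """Return the parent descriptor name, or None for a base disk."""
    match = PARENT_HINT.search(descriptor_text)
    return match.group("parent") if match else None


def chain_from_leaf(leaf, descriptors):
    """Walk from the leaf delta back to the base disk, leaf first.

    `descriptors` maps descriptor file name -> descriptor text. This is a
    hypothetical in-memory stand-in for reading the files from a datastore.
    """
    chain, seen = [], set()
    current = leaf
    while current is not None:
        if current in seen:
            raise ValueError(f"loop in snapshot chain at {current}")
        seen.add(current)
        chain.append(current)
        current = parent_of(descriptors[current])
    return chain


def clone_source(leaf, descriptors):
    """Return the N-1 link: the newest link the running VM does not write to."""
    chain = chain_from_leaf(leaf, descriptors)
    if len(chain) < 2:
        raise ValueError("disk has no snapshots below the leaf")
    return chain[1]
```

For a disk with 32 snapshot levels, chain_from_leaf would return the leaf, the 31 older deltas beneath it, and finally the base disk; clone_source returns the second entry, which is the point the clone is taken from.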
This was done with the vmkfstools -i command.
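As a rough illustration of the clone step, the Python sketch below assembles a vmkfstools -i invocation for one disk. The helper name, the datastore paths, and the choice of thin provisioning are assumptions made for the example; the original recovery does not record which destination format was used. The sketch only builds and prints the command so that it can be reviewed before being run on the ESXi host.

```python
import shlex

# Destination formats commonly passed to `vmkfstools -d`.
DISK_FORMATS = {"thin", "zeroedthick", "eagerzeroedthick"}


def clone_command(source_vmdk, dest_vmdk, disk_format="thin"):
    """Build the argument list for cloning a disk with vmkfstools -i.

    `source_vmdk` is the descriptor of the N-1 snapshot link. The clone is
    written as a single standalone disk at `dest_vmdk`.
    """
    if disk_format not in DISK_FORMATS:
        raise ValueError(f"unsupported disk format: {disk_format}")
    return ["vmkfstools", "-i", source_vmdk, dest_vmdk, "-d", disk_format]


# Hypothetical paths, for illustration only.
cmd = clone_command(
    "/vmfs/volumes/datastore1/vm/vm-000031.vmdk",
    "/vmfs/volumes/datastore1/vm-recovered/vm-recovered.vmdk",
)
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps paths containing spaces intact if the command is later handed to a process runner.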
After each disk was cloned to N-1, a new VMX file was created for the recovered virtual machine pointing to a snapshot reference.
The source VM was then powered off, causing a service outage.
The last snapshot in each chain was copied to the new VM directory and made to point to the cloned disk below it, which made the new VMX file valid.
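Repointing a copied snapshot comes down to editing its text descriptor. The Python sketch below shows one way this could be done, assuming the standard descriptor fields: parentFileNameHint names the parent disk, and parentCID has to match the CID of that parent. The function names and sample values are hypothetical, and the exact edits made during this recovery are not recorded in the write-up.

```python
import re

CID_LINE = re.compile(r"^CID=(?P<cid>[0-9a-fA-F]+)", re.MULTILINE)
PARENT_CID = re.compile(r"^parentCID=[0-9a-fA-F]+", re.MULTILINE)
PARENT_HINT = re.compile(r'^parentFileNameHint="[^"]*"', re.MULTILINE)


def read_cid(descriptor_text):
    """Return the content ID recorded in a descriptor."""
    match = CID_LINE.search(descriptor_text)
    if match is None:
        raise ValueError("descriptor has no CID line")
    return match.group("cid")


def repoint(child_text, parent_name, parent_text):
    """Return the child descriptor rewritten to sit on top of a new parent.

    The child's parentCID is set to the new parent's CID and its
    parentFileNameHint to the new parent's file name. The child's own CID
    and extent lines are left untouched.
    """
    parent_cid = read_cid(parent_text)
    text, cid_hits = PARENT_CID.subn(lambda m: f"parentCID={parent_cid}", child_text)
    text, hint_hits = PARENT_HINT.subn(
        lambda m: f'parentFileNameHint="{parent_name}"', text
    )
    if cid_hits != 1 or hint_hits != 1:
        raise ValueError("expected exactly one parentCID and one parentFileNameHint")
    return text
```

Working on a copy of the descriptor and checking that exactly one line of each kind was changed makes it harder to leave the file half-edited.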
The new virtual machine was then powered on and assumed the identity of the original, restoring the original service.
After in-guest data validation, the one remaining snapshot on the new virtual machine was consolidated via the vCenter GUI, which cleared all of the issues.
After further data validation, the original virtual machine was purged from the datastore, reclaiming its space.