Stale file locks on VMDKs left by 3rd party backup solutions

Article ID: 416996

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • During backup operations, backup software creates a VM snapshot and attaches either the base or previous delta disk to the backup proxy VM for data extraction.
  • The problem arises when these disks are not detached from the proxy VM, or when the VDDK disk handles are not properly closed after the backup completes. Either case leaves a file lock on the original VMDK.
  • This lock prevents the source VM from consolidating its snapshots, leading to a buildup of snapshots on storage.

Manifestations:

Scenario 1:

  • The backup solution's proxy VM, or another VM, holds a lock on the base disk or on a previous snapshot file of the VM that has the consolidation issue.
  • This prevents snapshot consolidation from succeeding.
  • Two typical examples:
    1. The backup job runs in VDDK HotAdd transport mode and the snapshot disk is hot-added to the backup proxy VM. After a backup failure, the snapshot disk is left attached to the proxy VM.
    2. The backup job runs in VDDK NBD transport mode and the VDDK disk handles are not closed after the job, which leaves the snapshot disk locked.
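For the HotAdd case, one way to spot a disk left behind on a proxy VM is to compare each attached disk's datastore path with the proxy's own folder. The following is a minimal sketch, not a vendor tool. It assumes the proxy's own disks sit in the same folder as its .vmx file, and the function names and sample inventory values are hypothetical; real paths would come from the vSphere API.

```python
import re

# Datastore paths look like "[datastore1] folder/file.vmdk".
_DS_PATH = re.compile(r"^\[(?P<datastore>[^\]]+)\]\s*(?P<folder>[^/]+)/(?P<file>.+)$")


def vm_folder(ds_path):
    """Return (datastore, folder) for a datastore path, or None if it does not parse."""
    m = _DS_PATH.match(ds_path)
    return (m.group("datastore"), m.group("folder")) if m else None


def find_leftover_disks(proxy_vmx_path, proxy_disk_paths):
    """List disks attached to the proxy that live outside the proxy's own folder.

    Assumption: the proxy's own disks sit next to its .vmx file, so a disk
    backed by a file in another folder was hot-added from a different VM.
    Paths that do not parse are also reported so they can be reviewed by hand.
    """
    home = vm_folder(proxy_vmx_path)
    return [p for p in proxy_disk_paths if vm_folder(p) != home]


# Hypothetical inventory data for illustration only.
proxy_vmx = "[datastore1] backup-proxy/backup-proxy.vmx"
proxy_disks = [
    "[datastore1] backup-proxy/backup-proxy.vmdk",
    "[datastore2] app-vm01/app-vm01-000001.vmdk",  # left behind by a failed job
]
leftovers = find_leftover_disks(proxy_vmx, proxy_disks)
print(leftovers)
```

Any path reported here is a candidate for detaching from the proxy VM (without deleting the backing file) once it is confirmed that no backup job is still running.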

Scenario 2:

  • The backup solution does not complete a backup gracefully, or fails to clean up properly.
  • A typical example:
    • The backup job is killed or crashes and does not perform the cleanup needed to recover its state and free the disk lock.

Scenario 3:

  • The backup solution runs an incorrect VADP workflow.
  • A typical example:
    • The snapshot disk is not closed properly before the snapshot is deleted and consolidated.
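The ordering requirement in Scenario 3 can be illustrated with a small in-memory model. This is only a sketch of the sequencing: FakeHost, opened and backup are hypothetical stand-ins written for this example and are not part of VDDK or the vSphere API. A real implementation would make the corresponding VDDK and snapshot calls at the marked points.

```python
from contextlib import contextmanager


class FakeHost:
    """In-memory stand-in for an ESX host; tracks which disk files are locked."""

    def __init__(self):
        self.locked = set()
        self.consolidated = False

    def open_disk(self, path):
        self.locked.add(path)  # opening a disk takes a lock on it
        return path

    def close_disk(self, path):
        self.locked.discard(path)  # closing the disk frees the lock

    def delete_snapshot(self, snapshot_disks):
        # Consolidation needs every disk in the chain to be unlocked.
        busy = sorted(self.locked.intersection(snapshot_disks))
        if busy:
            raise RuntimeError("locked: " + ", ".join(busy))
        self.consolidated = True


@contextmanager
def opened(host, paths):
    """Open disks and guarantee they are closed, even if the backup raises."""
    handles = []
    try:
        for p in paths:
            handles.append(host.open_disk(p))
        yield handles
    finally:
        for h in reversed(handles):
            host.close_disk(h)


def backup(host, snapshot_disks, read):
    try:
        with opened(host, snapshot_disks) as handles:
            for h in handles:
                read(h)
    finally:
        # Runs only after the with-block has closed every handle.
        host.delete_snapshot(snapshot_disks)
```

Because the snapshot delete sits outside the with-block, it cannot run while a handle is still open, whether the read step succeeds or raises.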

Environment

  • VMware vSphere ESXi
  • VMware ESX 9.x

Cause

The issue occurs mainly because a 3rd party backup solution does not follow the API workflow documented in the VDDK programming guide.

Resolution

  • Backup vendors should review their backup workflows to ensure that all VMDKs attached to proxy VMs during backup are reliably detached when the job completes. The backup process must not leave any locks on base or delta disks after the backup.
  • The backup solution should ensure that every disk it opens is closed, so that the lock on the ESX host is freed.
  • The backup solution should ensure that every disk is closed before it deletes the backup snapshot.
  • It is recommended that no snapshot operations other than the backup snapshot run during the backup. Unexpected issues are likely if the backup solution runs a backup job for a VM that has multiple snapshots.
  • If the backup job is killed or crashes, proper cleanup should be implemented to recover, by calling VDDK functions such as VixDiskLib_Cleanup and VixDiskLib_EndAccess.
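A backup solution can only clean up after a crash if it knows which jobs were in flight. One possible approach, shown here as a sketch, is a small on-disk journal that is replayed at startup. The JobJournal class and its file format are assumptions made for this example and are not part of VDDK. The cleanup callback is the place where a real implementation would call the VDDK functions named above.

```python
import json
import os
import tempfile


class JobJournal:
    """Records which VMs have a backup in flight so a restart can clean up."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return json.load(f)

    def _save(self, entries):
        # Write to a temp file and rename so a crash cannot leave a torn journal.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(entries, f)
        os.replace(tmp, self.path)

    def begin(self, vm):
        entries = self._load()
        if vm not in entries:
            entries.append(vm)
        self._save(entries)

    def end(self, vm):
        self._save([e for e in self._load() if e != vm])

    def recover(self, cleanup):
        """Call cleanup(vm) for every job that never reached end()."""
        recovered = []
        for vm in self._load():
            cleanup(vm)  # real code would release the VM's disk locks here
            self.end(vm)
            recovered.append(vm)
        return recovered


# Hypothetical VM names for illustration only.
journal = JobJournal(os.path.join(tempfile.mkdtemp(), "jobs.json"))
journal.begin("app-vm01")
journal.begin("app-vm02")
journal.end("app-vm02")  # this job finished normally
# ... the backup process is killed here; on the next start:
cleaned = []
print(journal.recover(cleaned.append))
```

An entry is removed only after its cleanup call returns, so a second crash during recovery leaves the remaining jobs in the journal for the next attempt.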

Workaround:

If you encounter file locks left over by backup solutions, follow the instructions in "Investigating Virtual Machine file locks on ESXi Host(s)".

Additional Information