Consolidating snapshot fails after third party backup succeeds

Article ID: 341620

Updated On: 02-05-2025

Products

VMware vSphere ESXi

Issue/Introduction

  • After a third party backup operation, delta files are not consolidated successfully.
  • The backup completes successfully, but the subsequent consolidation fails because of file locks.
  • The vmware.log file contains entries similar to:

    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vmx| I125: VigorTransportProcessClientPayload: opID=26c####f-b0-#### seq=#####: Receiving Snapshot.Take request.

    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vmx| I125: SnapshotVMX_TakeSnapshot start: '__GX_BACKUP__', deviceState=0, lazy=0, quiesced=1, forceNative=0, tryNative=1, saveAllocMaps=0 cb=11754C0A20, cbData=117####6D0
    [...]

    (The backup is performed)

    [...]
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vmx| I125: VigorTransportProcessClientPayload: opID=3aa####6-b1-#### seq=#####: Receiving Snapshot.Consolidate request.
    [...]

    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-LIB_BLOCKTRACK : Resuming change tracking.
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-CBT : Initializing ESX kernel change tracking for fid 62####039.
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-CBT : Successfuly created cbt node 2548154f-cbt.
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-CBT : Opening cbt node /vmfs/devices/cbt/25####4f-cbt

    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-LIB : Opened "/vmfs/volumes/#######-########-fdda-############/VM/VM_2.vmdk" (flags 0x8, type vmfsSparse).
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOGNRC: Failed to open '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' : Failed to lock the file (40003) (0x2013).
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOMGR: AIOMgr_OpenWithRetry: Descriptor file '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' locked (try 0)
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOGNRC: Failed to open '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' : Failed to lock the file (40003) (0x2013).
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOMGR: AIOMgr_OpenWithRetry: Descriptor file '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' locked (try 1)
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOGNRC: Failed to open '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' : Failed to lock the file (40003) (0x2013).
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOMGR: AIOMgr_OpenWithRetry: Descriptor file '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' locked (try 2)
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOGNRC: Failed to open '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' : Failed to lock the file (40003) (0x2013).
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: AIOMGR: AIOMgr_OpenWithRetry: Descriptor file '/vmfs/volumes/#######-########-fdda-############/VM/VM_2-flat.vmdk' locked (try 3)

    [...]
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: SNAPSHOT:Failed to open disk /vmfs/volumes/#######-########-fdda-############/VM/VM_2.vmdk : Failed to lock the file (16392)
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-CBT : Shutting down change tracking for untracked fid 65####39.
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: DISKLIB-CBT : Successfully disconnected CBT node.

    [...]
    [YYYY-MM-DDTHH:MM:SS]<Time Zone>| vcpu-0| I125: Consolidate End: Snapshot consolidate complete: Failed to lock the file (5).
    [...]
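
    The log signature above can be checked for mechanically. The following is a minimal sketch, not a supported tool: it assumes the vmware.log has been copied off the host and that its messages match the excerpts shown in this article. The function names scan_vmware_log and matches_symptom are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts in this article; other builds may word them differently.
LOCK_RETRY = re.compile(
    r"AIOMgr_OpenWithRetry: Descriptor file '([^']+)' locked \(try (\d+)\)"
)
CONSOLIDATE_REQUEST = "Receiving Snapshot.Consolidate request"
CONSOLIDATE_END = re.compile(r"Consolidate End: Snapshot consolidate complete: (.+)$")


def scan_vmware_log(lines):
    """Summarize consolidate requests, locked files, and consolidate results."""
    summary = {
        "consolidate_requests": 0,
        "locked_files": Counter(),   # path -> number of lock retries logged
        "consolidate_results": [],   # text after "Snapshot consolidate complete:"
    }
    for raw in lines:
        line = raw.rstrip("\r\n")
        if CONSOLIDATE_REQUEST in line:
            summary["consolidate_requests"] += 1
        retry = LOCK_RETRY.search(line)
        if retry:
            summary["locked_files"][retry.group(1)] += 1
        end = CONSOLIDATE_END.search(line)
        if end:
            summary["consolidate_results"].append(end.group(1).strip())
    return summary


def matches_symptom(summary):
    """True when lock retries were logged and a consolidate ended with a lock failure."""
    lock_failure = any(
        "Failed to lock the file" in result
        for result in summary["consolidate_results"]
    )
    return bool(summary["locked_files"]) and lock_failure


# Example usage (the path is an assumption):
# with open("vmware.log", errors="replace") as handle:
#     print(matches_symptom(scan_vmware_log(handle)))
```

    A True result only indicates that the log resembles the excerpt in this article; it does not by itself confirm the cause described below.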

Cause

  • When a backup is initiated, the third party backup provider creates a snapshot of the target VM.
  • During the backup, opening a disk might take longer than the third party backup provider's default timeout.
  • When this happens, the third party backup provider treats the operation as failed and tries to delete the snapshots, for example by initiating a consolidation operation.

    The consolidation task starts on the same disks that the third party backup provider tried to open when the timeout occurred. Because those disks are still open, the consolidation also fails and leaves stale delta files behind. The initial timeout and the still-open disks have the same origin: when a VMDK file is backed up using SAN transport mode, VDDK issues a "Map Disk Region" call to obtain the mapping of the disk.

  • SAN mode backups require building the entire mapping of the disk in memory before any I/O is allowed. This can take a very long time for large and/or fragmented disks.

Resolution

This is a known issue affecting the legacy SAN transport method of the Virtual Disk API, which is used by third party backup software.

Currently, there is no resolution.

Workaround

  • Use NBD transport mode for backups instead of SAN transport mode.
  • Note: NBD backup may be faster overall as there is no need to build the VMDK block mapping.

Additional Information