VM powered off due to consolidation failure
search cancel

VM powered off due to consolidation failure

book

Article ID: 398041

calendar_today

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Snapshot consolidation fails with errors indicating disk lock conflicts.

The failure occurred during the removal of a backup snapshot initiated by a third-party backup tool, with error messages such as:

Cannot open the disk ... or one of the snapshot disks it depends on. Locking conflict for file ... Kernel open flags are 0x8.

Logs from hostd and vmkernel show that:
The .vmdk file was locked by a VM process due to a stale NFC session initiated by the backup solution.

NFC file handles were not released as expected (NfcFile_Open occurred without a corresponding NfcFile_Close or Nfc_UnregisterFileHandle).

NFC session held the lock on the file but never closed it.

Locking conflict occurred when the host attempted consolidation. 

All VMkernel-level NFS locks were valid, confirming the issue was not with NFS layer, but NFC session behavior.

 

Snippet 

======================

Hostd.log 


YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2100341] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/############.vmx opID=7bf025ae-29-5815 sid=528cb7ea user=vpxuser:VSPHERE.LOCAL\#######] Remove snapshot request received: _cohesity_m_snapshot-8799707389132884-44808, 0
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100341] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/############.vmx opID=7bf025ae-29-5815 sid=528cb7ea user=vpxuser:VSPHERE.LOCAL\#######] State Transition (VM_STATE_ON -> VM_STATE_REMOVE_SNAPSHOT)
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2100341] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/############.vmx opID=7bf025ae-29-5815 sid=528cb7ea user=vpxuser:VSPHERE.LOCAL\#######] Consolidate disks after snapshot removal.
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2100356] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/############.vmx] Handling vmx message 22380479: Locking conflict for file "/vmfs/volumes/#########.vmdk". Kernel open flags are 0x8. Owner process on this host is world ID 4889615 with world name vmx-vcpu-0:#######.
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2099749] --> Locking conflict for file "/vmfs/volumes/##########.vmdk". Kernel open flags are 0x8. Owner process on this host is world ID 4889615 with world name vmx-vcpu-0:#######.
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2099749] --> Locking conflict for file "/vmfs/volumes/#########.vmdk". Kernel open flags are 0x8. Owner process on this host is world ID 4889615 with world name vmx-vcpu-0:#######.
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2099749] --> Cannot open the disk '/vmfs/volumes/#########.vmdk' or one of the snapshot disks it depends on.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100356] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 60951 : Error message on ########### on ####### in datacenter: An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100354] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/############.vmx] Answered question 223804792025-04-27T23:02:49.803Z Db(167) Hostd[2100354] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/#######/#######/#######.vmx] Handling vmx message 22380480: Performing disk cleanup. Cannot power off.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100354] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 60952 : Error message on ############ on ############.com in ha-datacenter: Performing disk cleanup. Cannot power off.
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2100352] [Originator@6876 sub=Vigor.Vmsvc.vm:/vmfs/volumes/############.vmx] Consolidate Disks message: An error occurred while consolidating disks: Unable to save snapshot file.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2099904] [Originator@6876 sub=Libs] VigorTransportReconnect: VM /vmfs/volumes/############.vmx does not appear to be running
YYYY-MM-DDTHH:MM:SS.SSSZ Db(167) Hostd[2099904] [Originator@6876 sub=Vigor.Vmsvc.vm:/vmfs/volumes/############.vmx] Vigor client state callback OFFLINE reason Failed to establish transport connection.

vmkernel.log 
==================

# All the NFS lock for the VM seems to be fine. 

# These are the nbd operation ID which got created for the VM on the affected date. 

YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100346] [Originator@6876 sub=Libs opID=nbdmode-000000da810c06d0] [NFC INFO]NfcFile_Open: session=DA83E4BC40 hdl=DA824A69E0 Local filename =  '/vmfs/volumes/#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcFile_Open: session=DA82136C70 hdl=DA819ECB40 Local filename =  '/vmfs/volumes/#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100357] [Originator@6876 sub=Libs opID=nbdmode-000000da815adcc0] [NFC INFO]NfcFile_Open: session=DA82C61070 hdl=DA82DBB7A0 Local filename = '/vmfs/volumes/#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100345] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcFile_Open: session=DA26E9AC50 hdl=DA838C1CA0 Local filename = '/vmfs/volumes/#######/#######_5.vmdk'

 

 opID=nbdmode-000000da815d2ce0]

YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100272] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0-TicketID:52dce4e2-afb0-3b4d-9c30-8637dc7531ef ExpirationDate:4015676015218 RemainingUseCount:0] [NFC INFO]NfcProcessSessionParams:SessionParams flags: 0x3
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100272] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0-TicketID:52dce4e2-afb0-3b4d-9c30-8637dc7531ef ExpirationDate:4015676015218 RemainingUseCount:0] [NFC INFO]NfcProcessSessionParams: server session supports interruption.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100272] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0-TicketID:52dce4e2-afb0-3b4d-9c30-8637dc7531ef ExpirationDate:4015676015218 RemainingUseCount:0] [NFC INFO]NfcProcessSessionParams: server session supports switch.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcAioServerInit: Initializing async session.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcAioInitSession: Disabling read-ahead buffer since the AIO buffer size of 65536 is >= the read-ahead buffer size of 65536. Explicitly setting flag 'NFC_AIO_SESSION_NO_NET_READ_AHEAD'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC WARNING]NfcAioInitServerSession: AIO session is created using deprecatedNFC_AIO_SESSION_SET_SRV_LOGLEVEL flag that is going to be ignored.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcAioProcessOpenSessionMsg: Initialized AIO session : DA82806C90.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]Nfc sessionId = DA82136C70, memUsage = 327680.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcAioProcessSetResPoolMsg: Start to set resource pool tag.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcAioProcessSetResPoolMsg: setsockopt resource pool tag(1) succeeded.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]NfcFile_Open: session=DA82136C70 hdl=DA819ECB40 Local filename = '/vmfs/volumes/##########.vmdk'    <-- This session was not closed. It still had a lock on the VMDK.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100350] [Originator@6876 sub=Libs opID=nbdmode-000000da815d2ce0] [NFC INFO]Nfc_RegisterFileHandle: sessionId=DA82136C70 fh=DA819ECB40( '/vmfs/volumes/#######/#######_5.vmdk')

 

opID=nbdmode-000000da8403be10
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100356] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10-TicketID:522d6f6a-07f3-2016-fa07-b23d14168035 ExpirationDate:4017510367860 RemainingUseCount:0] [NFC INFO]NfcProcessSessionParams:SessionParams flags: 0x3
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100356] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10-TicketID:522d6f6a-07f3-2016-fa07-b23d14168035 ExpirationDate:4017510367860 RemainingUseCount:0] [NFC INFO]NfcProcessSessionParams: server session supports interruption.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100356] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10-TicketID:522d6f6a-07f3-2016-fa07-b23d14168035 ExpirationDate:4017510367860 RemainingUseCount:0] [NFC INFO]NfcProcessSessionParams: server session supports switch.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcAioServerInit: Initializing async session.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcAioInitSession: Disabling read-ahead buffer since the AIO buffer size of 65536 is >= the read-ahead buffer size of 65536. Explicitly setting flag 'NFC_AIO_SESSION_NO_NET_READ_AHEAD'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC WARNING]NfcAioInitServerSession: AIO session is created using deprecatedNFC_AIO_SESSION_SET_SRV_LOGLEVEL flag that is going to be ignored.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcAioProcessOpenSessionMsg: Initialized AIO session : DA83EA2B00.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]Nfc sessionId = DA26E9AC50, memUsage = 327680.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcAioProcessSetResPoolMsg: Start to set resource pool tag.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100271] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcAioProcessSetResPoolMsg: setsockopt resource pool tag(1) succeeded.
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100345] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcFile_Open: session=DA26E9AC50 hdl=DA838C1CA0 Local filename = ' '/vmfs/volumes/#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2100345] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]Nfc_RegisterFileHandle: sessionId=DA26E9AC50 fh=DA838C1CA0( '/vmfs/volumes/#######/#######_5.vmdk')
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2099834] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcAioCloseFile: Closing file '[IND_HYD_DATAST01] /#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2099834] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcFile_Close: session=DA26E9AC50 hdl=DA838C1CA0  filename= '/vmfs/volumes/#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2099834] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]NfcFileConcludeFileStats: session=DA26E9AC50 Local filename =  '/vmfs/volumes/#######/#######_5.vmdk'
YYYY-MM-DDTHH:MM:SS.SSSZ In(166) Hostd[2099834] [Originator@6876 sub=Libs opID=nbdmode-000000da8403be10] [NFC INFO]Nfc_UnregisterFileHandle: sessionId=DA26E9AC50 found=1 fh=DA838C1CA0( '/vmfs/volumes/#######/#######_5.vmdk')

 

Environment

ESXi 8.x 

Cause

The snapshot consolidation failure was caused by a third-party backup session holding a persistent lock on the VM disk file without releasing it.

The lock was held due to an incomplete NFC file session lifecycle (NfcFile_Open without NfcFile_Close), resulting in consolidation failure when VMware attempted to acquire an exclusive lock.

Resolution

Fix via Patch:

Workaround (if patching is not feasible):

  • Disable the NFS lock upgrade functionality by adding the following VM advanced setting:
  • consolidate.upgradeNFS3Locks = "FALSE"