Quiesced snapshot on Windows VM errors out with "Failed to quiesce the virtual machine" due to timeout waiting for VSS writers list

Article ID: 428895

Products

VMware vSphere ESXi

Issue/Introduction

  • When you attempt to back up a Windows virtual machine using a quiesced snapshot, the task fails with the error below, while standard (non-quiesced) snapshots succeed without issue.

An error occurred while taking a snapshot: Failed to quiesce the virtual machine.

  • In the VM log, the quiesced snapshot is seen failing with a timeout:

/vmfs/volumes/<datastore>/<vm_folder>/vmware.log

YYYY-MM-DDThh:mm:ss In(05) vmx - VigorTransportProcessClientPayload: opID=7###32c7-0-e##e seq=18##32: Receiving Snapshot.Take request.
YYYY-MM-DDThh:mm:ss In(05) vmx - SnapshotVMX_TakeSnapshot start: '#####', deviceState=0, lazy=0, quiesced=1, forceNative=0, tryNative=1, saveAllocMaps=0
YYYY-MM-DDThh:mm:ss In(05) vmx - Msg_Post: Warning
YYYY-MM-DDThh:mm:ss In(05) vmx - [msg.snapshot.quiesce.timeout] Timed out while quiescing the virtual machine.
YYYY-MM-DDThh:mm:ss In(05) vmx - ----------------------------------------

  • Running vssadmin list writers from a Command Prompt with administrator privileges inside the Windows Guest OS shows that one or more VSS writers are in a state other than [1] Stable (for example, reporting Retryable error or Timed out). These writers continue to fail even after the VSS service is restarted and the snapshot is re-attempted.

    Example of a healthy VSS writer:

    Writer name: '<Writer name>'
    Writer Id: {#####-####-####}
    Writer Instance Id: {####-####-####}
    State: [1] Stable
    Last error: No error

    Example of an unhealthy VSS writer:

    Writer name: '<Writer name>'
    Writer Id: {#####-####-####}
    Writer Instance Id: {####-####-####}
    State: [10] Failed
    Last error: Timed out


     
  • After enabling debug logging for VMware Tools and excluding the VSS writers that continue to fail (for steps, refer to the Additional Information section), entries similar to the following are observed. The ESXi hypervisor requests the VSS writer list and other details from the Guest OS, enters a waiting state, and eventually fails with a timeout:

YYYY-MM-DDThh:mm:ss In(05) vmx - SNAPSHOT: SnapshotPrepareTakeDoneCB: Prepare phase complete (The operation completed successfully)
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-0 - ToolsBackup: changing quiesce state: IDLE -> STARTED
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-0 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::PreProcess():1298: App quiescing requested: true, Writable snapshot requested: true.
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-1 - Guest: [   debug] [vmvss:vmvss] [72#2] CVmSnapshotRequestor::GatherWriterList():2130:enter
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-1 - Guest: [   debug] [vmvss:vmvss] [72#2] CVmSnapshotRequestor::WaitForOperation():3907: enter
YYYY-MM-DDThh:mm:ss In(05) vcpu-3 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::QueryStatus():1825: enter
YYYY-MM-DDThh:mm:ss In(05) vcpu-3 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::NeedWaitForSnapshotCompleted():1491: enter
YYYY-MM-DDThh:mm:ss In(05) vcpu-1 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::QueryStatus():1825: enter
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-1 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::NeedWaitForSnapshotCompleted():1491: enter
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-0 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::NeedWaitForSnapshotCompleted():1491: enter
YYYY-MM-DDThh:mm:ss In(05) vmx - Msg_Post: Warning
YYYY-MM-DDThh:mm:ss In(05) vmx - [msg.snapshot.quiesce.timeout] Timed out while quiescing the virtual machine.
YYYY-MM-DDThh:mm:ss In(05) vmx - ----------------------------------------
YYYY-MM-DDThh:mm:ss In(05) vcpu-0 - ToolsBackup: changing quiesce state: FINISHING -> ERROR_WAIT
YYYY-MM-DDThh:mm:ss In(05) vcpu-0 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::Abort():230: enter
YYYY-MM-DDThh:mm:ss In(05) vmx - ToolsBackup: changing quiesce state: ERROR_WAIT -> IDLE
YYYY-MM-DDThh:mm:ss In(05) vmx - ToolsBackup: changing quiesce state: IDLE -> DONE
...
YYYY-MM-DDThh:mm:ss In(05) vcpu-0 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::FinalRelease():181: enter
YYYY-MM-DDThh:mm:ss In(05) vcpu-2 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::WaitForUnregistration():3980: enter
YYYY-MM-DDThh:mm:ss In(05) vcpu-2 - Guest: [   debug] [vmvss:vmvss] [12##8] CVmSnapshotRequestor::Abort():230: enter
YYYY-MM-DDThh:mm:ss In(05) vcpu-2 - Guest: [ warning] [vmvss:vmvss] [72#2] CVmSnapshotRequestor::WaitForOperation():3953: Async operation result: Operation aborted
YYYY-MM-DDThh:mm:ss In(05) vcpu-2 - Guest: [   debug] [vmvss:vmvss] [72#2] CVmSnapshotRequestor::DoSnapshotSet():2319: failed call: ret = WaitForOperation(async), result = 0x80004004
YYYY-MM-DDThh:mm:ss In(05) vcpu-2 - Guest: [    info] [vmvss:vmvss] [72#2] CVmSnapshotRequestor::CheckForAbort():718: Canceling operation.
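
The log signatures shown above can be checked for mechanically. The following Python sketch is illustrative only and is not a VMware-provided tool: it counts the quiesce timeout message and the 0x80004004 result code, and collects the ToolsBackup state transitions. The function name summarize_quiesce_log is hypothetical, and the sketch assumes the log lines keep the wording shown in the excerpts above.

```python
import re

# Markers copied from the vmware.log excerpts in this article.
TIMEOUT_MARKER = "msg.snapshot.quiesce.timeout"
ABORT_CODE = "0x80004004"  # E_ABORT
STATE_CHANGE = re.compile(r"ToolsBackup: changing quiesce state: (\w+) -> (\w+)")


def summarize_quiesce_log(lines):
    """Summarize quiesce-related entries from an iterable of log lines.

    Hypothetical helper: returns counts of timeout and abort entries and
    the ordered list of (old_state, new_state) quiesce transitions.
    """
    summary = {"timeouts": 0, "aborts": 0, "transitions": []}
    for line in lines:
        if TIMEOUT_MARKER in line:
            summary["timeouts"] += 1
        if ABORT_CODE in line:
            summary["aborts"] += 1
        match = STATE_CHANGE.search(line)
        if match:
            summary["transitions"].append((match.group(1), match.group(2)))
    return summary
```

To use it, copy vmware.log from the VM folder to a workstation and pass the open file object (or a list of lines) to summarize_quiesce_log. A non-zero timeout count together with a transition into ERROR_WAIT matches the pattern described in this article.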

Cause

This issue occurs when the VMware Snapshot Provider sends a request to the Windows Volume Shadow Copy Service (VSS) but times out waiting for a response from the Guest OS. This can happen when one or more VSS writers fail to respond within the default timeout period. The error 0x80004004 (E_ABORT) indicates that the operation was aborted in the Guest OS, possibly because of these writer delays or failures.

Resolution

To resolve the issue, investigate and stabilize the VSS components within the Windows Guest OS so that they respond to the VMware VSS component in time. Engage Microsoft Support to troubleshoot the specific failing VSS writers. Common OS-level remediation includes unregistering and re-registering the problematic VSS writers. For more information, refer to Troubleshooting Volume Shadow Copy (VSS) quiesce related issues.
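
As a starting point for that investigation, the output of vssadmin list writers can be filtered down to the writers that need attention. The Python sketch below is an illustration under stated assumptions, not an official utility: the helper names are hypothetical, it assumes the English field labels shown earlier in this article (Writer name, State, Last error), and it must run from an elevated prompt inside the Windows Guest OS.

```python
import subprocess


def parse_writers(output):
    """Parse 'vssadmin list writers' text into a list of dicts.

    Assumes the English labels 'Writer name:', 'State:' and 'Last error:'.
    """
    writers = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            name = line.split(":", 1)[1].strip().strip("'")
            current = {"name": name}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Return writers that are not '[1] Stable' with 'No error'."""
    return [
        w for w in writers
        if w.get("state") != "[1] Stable" or w.get("last_error") != "No error"
    ]


def list_unhealthy_writers():
    """Run vssadmin inside the guest (requires administrator privileges)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return unhealthy_writers(parse_writers(result.stdout))
```

Calling list_unhealthy_writers() before and after restarting the VSS service gives a quick view of which writers stay in a failed state and are worth raising with Microsoft Support.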

Workaround

To work around the issue, or to determine whether a particular application is causing it, bypass application-level consistency and fall back to file-system consistency using the steps below:

  1. Log in to the vCenter Server using the vSphere Client.

  2. Power off the virtual machine.

  3. Right-click the virtual machine and select Edit Settings.

  4. Select the VM Options tab.

  5. Expand the Advanced section.

  6. Click Edit Configuration next to Configuration Parameters.

  7. Click Add Configuration Params and enter the following:

    • Name: disk.EnableUUID

    • Value: FALSE (If the parameter already exists, modify its value to FALSE)

  8. Click OK to save the Configuration Parameters, and click OK again to save the VM settings.

  9. Power on the virtual machine and re-attempt the quiesced snapshot.

For more information, refer to Disabling Application Level Quiescing During Snapshots of Virtual Machines.
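
Where many virtual machines are affected, the same setting could be applied through the vSphere API instead of the vSphere Client. The Python sketch below assumes the pyVmomi SDK is installed and that vm is a vim.VirtualMachine object already retrieved from an authenticated session (connection handling is omitted). The helper names are hypothetical, and the approach is an assumption of this example rather than a procedure documented in this article.

```python
DISK_UUID_KEY = "disk.EnableUUID"


def needs_update(extra_config):
    """Return True if disk.EnableUUID is missing or not set to FALSE.

    extra_config is any iterable of objects with .key and .value
    attributes, such as vm.config.extraConfig in pyVmomi.
    """
    for option in extra_config:
        if option.key == DISK_UUID_KEY:
            return str(option.value).upper() != "FALSE"
    return True


def set_disk_enable_uuid_false(vm):
    """Set disk.EnableUUID to FALSE on a powered-off VM (sketch only).

    Returns the reconfigure task, or None if no change is needed.
    """
    from pyVmomi import vim  # imported lazily so needs_update has no dependency

    # Mirrors step 2 above: the VM must be powered off first.
    if vm.runtime.powerState != vim.VirtualMachinePowerState.poweredOff:
        raise RuntimeError("Power off the virtual machine before changing the setting")
    if not needs_update(vm.config.extraConfig):
        return None
    spec = vim.vm.ConfigSpec(
        extraConfig=[vim.option.OptionValue(key=DISK_UUID_KEY, value="FALSE")]
    )
    return vm.ReconfigVM_Task(spec=spec)
```

After the returned task completes, power on the virtual machine and re-attempt the quiesced snapshot, as in step 9.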

Additional Information

Enabling debug logging for VMware Tools within a guest operating system

Disabling specific VSS writers with VMware Tools