On Windows Server, failure to quiesce VM before snapshot

Article ID: 301572

Products

VMware vSphere ESXi

Issue/Introduction

This article describes quiesced snapshot failures on Windows Server VMs and the fixes currently available.

Symptoms:
When taking a quiesced snapshot of a Windows Server VM, snapshot creation fails with this four-sentence message: "An error occurred while quiescing the virtual machine. See the virtual machine's event log for details. An error occurred while taking a snapshot: Failed to quiesce the virtual machine. An error occurred while saving the snapshot: Failed to quiesce the virtual machine."
 
If VMware Tools VSS logging is enabled (https://knowledge.broadcom.com/external/article?legacyId=1007873), vmware.log shows that some writers fail at the freeze stage with VSS_WRITER_STATE VSS_WS_FAILED_AT_FREEZE (9). In the sample log below, the System Writer reports state 9:
 
2022-04-27T03:25:07.393Z In(05) vcpu-0 - Guest: [ debug] [vmvss:vmvss] [6424] VDSHelper::ForEachVDSPack():303: return (0x0)
2022-04-27T03:25:07.394Z In(05) vcpu-1 - Guest: [ debug] [vmvss:vmvss] [6424] VDSHelper::ForEachVolume():337: return (0x0)
2022-04-27T03:25:07.399Z In(05) vcpu-1 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::GatherWriterStatus():2359: enter
2022-04-27T03:25:07.400Z In(05) vcpu-0 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::WaitForOperation():3630: enter
2022-04-27T03:25:07.425Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::CheckWriterStatus():2420: enter
2022-04-27T03:25:07.426Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::CheckWriterStatus():2466: Task Scheduler Writer (1)
2022-04-27T03:25:07.426Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::CheckWriterStatus():2466: VSS Metadata Store Writer (1)
2022-04-27T03:25:07.426Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::CheckWriterStatus():2466: Performance Counters Writer (1)
2022-04-27T03:25:07.427Z In(05) vcpu-2 - Guest: [ warning] [vmvss:vmvss] [7796] CVmSnapshotRequestor::CheckWriterStatus():2454: writer System Writer in failed state: res = 0x800423f2, err = 0x1, error =
2022-04-27T03:25:07.427Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::CheckWriterStatus():2466: System Writer (9)
2022-04-27T03:25:07.428Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::LogComponentError():800: enter
2022-04-27T03:25:07.435Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::LogComponentError():835: failed call: ret = compEx2->GetFailure(&writerErr, &appErr, &appMessage, NULL), result = 0x80070057
2022-04-27T03:25:07.435Z In(05) vcpu-2 - Guest: [ debug] [vmvss:vmvss] [7796] CVmSnapshotRequestor::DoSnapshotSet():2148: failed call: ret = GatherWriterStatus(), result = 0x80042301
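
To find the affected writers in a longer vmware.log, the CheckWriterStatus lines can be filtered for state 9. The following Python sketch is illustrative only: it assumes the log lines have the same layout as the sample above, and the function name failed_at_freeze is hypothetical, not part of VMware Tools.

```python
import re

# Matches lines such as "...CheckWriterStatus():2466: System Writer (9)".
# Assumption: the writer name is followed by its numeric state in parentheses
# at the end of the line, as in the sample log above.
WRITER_STATE = re.compile(
    r"CheckWriterStatus\(\):\d+: (?P<name>.+?) \((?P<state>\d+)\)\s*$"
)

VSS_WS_FAILED_AT_FREEZE = 9


def failed_at_freeze(log_lines):
    """Return the names of writers reported in state 9 (VSS_WS_FAILED_AT_FREEZE)."""
    failed = []
    for line in log_lines:
        match = WRITER_STATE.search(line)
        if match and int(match.group("state")) == VSS_WS_FAILED_AT_FREEZE:
            failed.append(match.group("name"))
    return failed
```

Passing the lines of the sample log to failed_at_freeze returns only the System Writer, because the other writers report state 1.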

 
If a VSS trace is collected during snapshot creation (https://docs.microsoft.com/en-us/windows/win32/vss/using-tracing-tools-with-vss), it shows that Windows spends about 50 seconds in the THAW_KTM stage:
 
[ 0:43:17.017 P:0C78 T:1534 REGREGSC(1348) GEN] Event name: THAW_KTM (Enter)
[ 0:44:08.626 P:0348 T:15B8 WRTWRTIC(2279) WRITER] Aborting due to timeout
[ 0:44:08.626 P:0C78 T:15E0 WRTWRTIC(2279) WRITER] Aborting due to timeout
[ 0:44:08.641 P:0C78 T:1580 WRTWRTIC(2279) WRITER] Aborting due to timeout
[ 0:44:08.641 P:06A8 T:14AC WRTWRTIC(2279) WRITER] Aborting due to timeout
[ 0:44:08.641 P:0C78 T:1448 WRTWRTIC(2279) WRITER] Aborting due to timeout
[ 0:44:08.673 P:047C T:164C WRTWRTIC(2279) WRITER] Aborting due to timeout
[ 0:44:08.739 P:0C78 T:1534 REGREGSC(1348) GEN] Event name: THAW_KTM (Leave) 

 
Because the total freeze timeout is 60 seconds by default, the long delay in the THAW_KTM stage causes some VSS writers to time out and fail at the freeze stage.
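
The time spent in THAW_KTM can be measured from the Enter and Leave events in the VSS trace and compared with the 60-second default freeze timeout. The following Python sketch is a rough illustration: it assumes trace lines in the format of the sample above, with timestamps that do not wrap past midnight. The names stage_seconds and FREEZE_TIMEOUT_SECONDS are hypothetical.

```python
import re

# Default total freeze timeout stated in this article, in seconds.
FREEZE_TIMEOUT_SECONDS = 60

# Matches lines such as
# "[ 0:43:17.017 P:0C78 T:1534 REGREGSC(1348) GEN] Event name: THAW_KTM (Enter)".
EVENT = re.compile(
    r"\[\s*(\d+):(\d+):(\d+)\.(\d+)\s.*Event name: (\w+) \((Enter|Leave)\)"
)


def stage_seconds(trace_lines, stage="THAW_KTM"):
    """Return the seconds between the first Enter and the next Leave of a stage."""
    entered_at = None
    for line in trace_lines:
        match = EVENT.search(line)
        if not match or match.group(5) != stage:
            continue
        hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
        fraction = int(match.group(4)) / 10 ** len(match.group(4))
        timestamp = hours * 3600 + minutes * 60 + seconds + fraction
        if match.group(6) == "Enter":
            if entered_at is None:
                entered_at = timestamp
        elif entered_at is not None:
            return timestamp - entered_at
    return None  # stage not found, or no matching Leave event


if __name__ == "__main__":
    sample = [
        "[ 0:43:17.017 P:0C78 T:1534 REGREGSC(1348) GEN] Event name: THAW_KTM (Enter)",
        "[ 0:44:08.626 P:0348 T:15B8 WRTWRTIC(2279) WRITER] Aborting due to timeout",
        "[ 0:44:08.739 P:0C78 T:1534 REGREGSC(1348) GEN] Event name: THAW_KTM (Leave)",
    ]
    elapsed = stage_seconds(sample)
    print(f"THAW_KTM took {elapsed:.3f} s of the {FREEZE_TIMEOUT_SECONDS} s freeze timeout")
```

For the sample trace, the interval between the two THAW_KTM events is about 51.7 seconds, which leaves little of the 60-second budget for the VSS writers.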

Environment

VMware vSphere 7.0.x
VMware vSphere 6.x

Cause

This is a known bug in the Windows Kernel Transaction Manager (KTM).

Resolution

For Windows Server 2016 and earlier, Microsoft will not provide a fix.
For Windows Server 2019, the issue no longer reproduces after installing Windows update KB5014669 (https://www.catalog.update.microsoft.com/Search.aspx?q=KB5014669).
For Windows Server 2022, Microsoft is investigating the issue.

Workaround:
None currently, other than installing KB5014669 on Windows Server 2019.