VM Unresponsive and Snapshot Consolidation Fails with a Timeout Error on Large VM After Snapshot Deletion

Article ID: 400216

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

Symptoms:

After deleting a snapshot on a large virtual machine (e.g., an Oracle DB VM), the VM shows a warning that disk consolidation is needed.

Attempts to consolidate the VM fail with errors such as GenericVmConfigFault, and the following message appears under the VM's tasks:

An error occurred while consolidating disks: 110 (Connection timed out).

The vCenter Server log /var/log/vmware/vpxd/vpxd.log contains a similar error:

--> Result:
--> (vim.fault.GenericVmConfigFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.snapshot.vigor.consolidate.error",
-->          arg = (vmodl.KeyAnyValue) [
-->             (vmodl.KeyAnyValue) {
-->                key = "1",
-->                value = "110 (Connection timed out)"
-->             }
-->          ],
-->          message = "An error occurred while consolidating disks: 110 (Connection timed out)."
-->       }
-->    ],
-->    reason = "An error occurred while consolidating disks: 110 (Connection timed out)."
-->    msg = "An error occurred while consolidating disks: 110 (Connection timed out)."
--> }
--> Args:
-->
[timestamp] info vpxd[06333] [Originator@6876 sub=cdrsPlmt opID=CdrsLoadBalancer-####################] Load-balance iteration 5525 scanned 9086120 vmotion 5 min 29 max 4663118 total 869590269 hosts 0 skipped 0 ######-###: false
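
To locate this fault in a large log, a short script can help. The following is a minimal sketch, assuming a copy of vpxd.log has been read into a string; the function name find_consolidation_errors is hypothetical, and the message key and value layout are taken from the excerpt above.

```python
import re

# Message key as it appears in the vpxd.log excerpt above.
CONSOLIDATE_ERROR_KEY = "msg.snapshot.vigor.consolidate.error"


def find_consolidation_errors(log_text):
    """Return (line_number, error_value) pairs for consolidation faults.

    line_number is the 1-based line on which the message key appears;
    error_value is the first 'value = "..."' entry found in the next few lines.
    """
    hits = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if CONSOLIDATE_ERROR_KEY not in line:
            continue
        # The error value follows the key within a few lines of the fault dump.
        for follow in lines[i + 1:i + 8]:
            match = re.search(r'value = "([^"]*)"', follow)
            if match:
                hits.append((i + 1, match.group(1)))
                break
    return hits
```

Calling find_consolidation_errors(open("vpxd.log").read()) on a local copy of the log returns one entry per logged fault, which makes it easier to see how often the timeout occurred.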

Environment

VMware vSphere ESXi 7.0
VMware vSphere ESXi 8.0

Cause

This issue typically arises when a snapshot was taken while the datastore was full and was later removed manually after the datastore was extended.
Residual Changed Block Tracking (CBT) metadata within the VM configuration and disk descriptor files can interfere with disk consolidation.
The default timeout may be insufficient when dealing with large multi-disk VMs.

Resolution

1. Remove CBT Metadata References:
Comment out the changeTrackPath entries in the VM's .vmdk descriptor files and .vmx file.

Reference: Broadcom KB 322285
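
As an illustration of step 1, the sketch below comments out changeTrackPath lines in the text of a descriptor or configuration file. It is not the procedure from the referenced KB; it assumes the VM is powered off, that you are working on a backed-up copy of the file, and that a leading # marks a comment line. The function name comment_out_change_track_path is hypothetical.

```python
def comment_out_change_track_path(text):
    """Return text with every changeTrackPath line prefixed by '#'.

    Lines that are already commented out do not start with the key,
    so running the function twice does not add a second '#'.
    """
    result = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().lower().startswith("changetrackpath"):
            result.append("#" + line)
        else:
            result.append(line)
    return "".join(result)
```

A typical use is to read the file copy, pass its contents through the function, review the difference, and only then write the result back.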

2. Refresh Metadata:
Take a new snapshot and immediately remove it. This refreshes internal metadata without reintroducing prior corruption.

3. Verify Snapshot Chain:
Check each VMDK in the VM's disk chain for integrity.

Reference: Broadcom KB 337269
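
One common integrity check is that each delta disk's parentCID matches the CID of the disk named in its parentFileNameHint. The sketch below applies that check to descriptor text that has already been read into memory. It is an illustration only, not a replacement for the procedure in the referenced KB; the function names and the descriptors mapping (file name to descriptor text) are hypothetical.

```python
import re

# A parentCID of ffffffff marks a base disk with no parent.
NO_PARENT = "ffffffff"


def parse_descriptor(text):
    """Extract CID, parentCID and parentFileNameHint from descriptor text."""
    fields = {}
    for key in ("CID", "parentCID", "parentFileNameHint"):
        pattern = r'^[ \t]*' + key + r'[ \t]*=[ \t]*"?([^"\n]*)"?'
        match = re.search(pattern, text, re.MULTILINE)
        if match:
            fields[key] = match.group(1).strip()
    return fields


def check_chain(descriptors, leaf):
    """Walk from the leaf disk to the base disk and return a list of problems."""
    problems = []
    seen = set()
    name = leaf
    while True:
        if name in seen:
            problems.append(f"loop detected at {name}")
            break
        seen.add(name)
        if name not in descriptors:
            problems.append(f"missing descriptor: {name}")
            break
        child = parse_descriptor(descriptors[name])
        parent_cid = child.get("parentCID", NO_PARENT).lower()
        if parent_cid == NO_PARENT:
            break  # reached the base disk
        parent_name = child.get("parentFileNameHint")
        if parent_name is None:
            problems.append(f"{name}: parentCID set but no parentFileNameHint")
            break
        if parent_name in descriptors:
            parent = parse_descriptor(descriptors[parent_name])
            if parent.get("CID", "").lower() != parent_cid:
                problems.append(
                    f"{name}: parentCID {parent_cid} does not match "
                    f"CID of {parent_name}"
                )
        name = parent_name
    return problems
```

An empty list means the CID links in that chain are consistent; any entry names the descriptor where the link is broken.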

4. Increase Consolidation Timeout:
Add the consolidation timeout setting described in the referenced KB to the VM's .vmx file, or set it through the VM's advanced configuration.

Reference: Broadcom KB 321362

5. Retry Consolidation:
Initiate the consolidation process again. Due to the VM's size and number of disks, this operation may take an extended amount of time.

Next Steps:

If consolidation fails again, retry consolidation on the affected disk(s).
Repeat until all disks are successfully consolidated.
Once completed, power on the VM and verify normal operation.
If none of the above steps resolves the issue, clone the VM.
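
The retry guidance above can be sketched as a bounded loop. This is a minimal sketch under stated assumptions: consolidate and needs_consolidation are hypothetical callables that you supply (for example, thin wrappers around the vSphere API's ConsolidateVMDisks_Task and the VM's consolidationNeeded runtime flag), and the attempt limit and wait time are arbitrary placeholders.

```python
import time


def retry_consolidation(consolidate, needs_consolidation,
                        max_attempts=5, wait_seconds=60, sleep=time.sleep):
    """Retry consolidation until it is no longer needed or attempts run out.

    Returns True if the VM no longer needs consolidation, False otherwise.
    """
    for attempt in range(1, max_attempts + 1):
        if not needs_consolidation():
            return True
        try:
            consolidate()
        except Exception as exc:
            # A timeout on a large VM is expected here; wait and try again.
            print(f"consolidation attempt {attempt} failed: {exc}")
            sleep(wait_seconds)
    return not needs_consolidation()
```

Capping the number of attempts keeps the loop from running indefinitely; if it returns False, fall back to cloning the VM.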

Additional Information

This issue is commonly seen on large VMs with 10 or more disks, where consolidation runs beyond the default timeout threshold.
CBT references may persist even after snapshot removal, especially when storage pressure or metadata corruption occurred.
