VM Unresponsive and Snapshot Consolidation Fails with Timeout Errors on Large VM After Snapshot Deletion

Article ID: 400216

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

Symptoms:

After deleting a snapshot on a large virtual machine (e.g., an Oracle DB VM), the VM displays a warning that disk consolidation is needed.

Consolidation attempts fail with errors such as GenericVmConfigFault, and the following message appears under the VM's tasks:

An error occurred while consolidating disks: 110 (Connection timed out)

The vCenter log /var/log/vmware/vpxd/vpxd.log contains entries similar to the following:

--> Result:
--> (vim.fault.GenericVmConfigFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.snapshot.vigor.consolidate.error",
-->          arg = (vmodl.KeyAnyValue) [
-->             (vmodl.KeyAnyValue) {
-->                key = "1",
-->                value = "110 (Connection timed out)"
-->             }
-->          ],
-->          message = "An error occurred while consolidating disks: 110 (Connection timed out)."
-->       }
-->    ],
-->    reason = "An error occurred while consolidating disks: 110 (Connection timed out)."
-->    msg = "An error occurred while consolidating disks: 110 (Connection timed out)."
--> }
--> Args:
-->
[timestamp] info vpxd[06333] [Originator@6876 sub=cdrsPlmt opID=CdrsLoadBalancer-####################] Load-balance iteration 5525 scanned 9086120 vmotion 5 min 29 max 4663118 total 869590269 hosts 0 skipped 0 ######-###: false
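
To confirm that a vCenter log contains this fault, the relevant lines can be pulled out with grep. The following is a minimal sketch, assuming a shell on the vCenter Server appliance; show_consolidate_errors is a hypothetical helper name, and the search strings are taken from the excerpt above.

```sh
#!/bin/sh
# Sketch only: list vpxd.log lines that carry the consolidation timeout fault.
# show_consolidate_errors is a hypothetical helper name, not a VMware tool.
show_consolidate_errors() {
    # Default to the log path named in this article; pass another file to override.
    log=${1:-/var/log/vmware/vpxd/vpxd.log}
    # -F treats the patterns as fixed strings, -n prefixes each hit with its line number.
    grep -n -F \
        -e 'msg.snapshot.vigor.consolidate.error' \
        -e 'consolidating disks: 110 (Connection timed out)' \
        "$log"
}
```

Calling show_consolidate_errors with no argument reads the default log path. The function returns grep's exit status, so a non-zero status means no matching lines were found in that file.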

Environment

VMware vSphere ESXi 7.0
VMware vSphere ESXi 8.0

Cause

  • This issue typically arises when a snapshot was taken while the datastore was full and was later removed manually after the datastore was extended.
  • Residual Changed Block Tracking (CBT) metadata within the VM configuration and disk descriptor files can interfere with disk consolidation.
  • The default timeout may be insufficient when dealing with large multi-disk VMs.

Resolution

1. Remove CBT Metadata References:
Comment out the changeTrackPath entry in the VM's .vmdk descriptor files and .vmx file:

sed -i 's/^changeTrackPath/#changeTrackPath/' <vmdk file>

Reference: Broadcom KB 322285
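
Because the sed command edits the file in place, keeping a copy of the original is a sensible precaution. The following is a minimal sketch that wraps the same sed expression with a one-time backup; comment_out_ctk and the .bak suffix are hypothetical names chosen for illustration, and the sketch assumes it is pointed at a text descriptor file, not at a -flat, -delta, -sesparse, or -ctk data file.

```sh
#!/bin/sh
# Sketch only: comment out changeTrackPath in one descriptor file, keeping a backup.
# comment_out_ctk and the .bak suffix are hypothetical, not part of any VMware tool.
comment_out_ctk() {
    descriptor=$1
    # Back up once; a re-run must not overwrite the copy of the original file.
    if [ ! -e "$descriptor.bak" ]; then
        cp -p "$descriptor" "$descriptor.bak" || return 1
    fi
    # Same expression as in the step above; lines already commented are left alone.
    sed -i 's/^changeTrackPath/#changeTrackPath/' "$descriptor" || return 1
    # Show the result so the change can be reviewed.
    grep -n 'changeTrackPath' "$descriptor" || echo "no changeTrackPath entry in $descriptor"
}
```

Example: comment_out_ctk <vmdk file>. Running it a second time leaves the file unchanged, because the pattern only matches lines that still begin with changeTrackPath.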

 

2. Refresh Metadata:
Take a new snapshot and immediately remove it. This refreshes internal metadata without reintroducing prior corruption.

 

3. Verify Snapshot Chain:
Check each VMDK in the VM's disk chain for integrity:

vmkfstools -q -v1 <vmdk file>

Reference: Broadcom KB 337269
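
On a VM with many disks, running the check by hand for every file is tedious. The following is a minimal sketch that loops over the descriptor files in one VM directory and runs the same vmkfstools check on each. It assumes all of the VM's disks live in a single directory and that data files follow the usual -flat, -delta, -sesparse, and -ctk naming; check_chain, is_descriptor, and the VMKFSTOOLS override variable are hypothetical names introduced for illustration.

```sh
#!/bin/sh
# Sketch only: run the chain check on every descriptor file in one VM directory.
# VMKFSTOOLS is a hypothetical override so the loop can be dry-run on another system.
VMKFSTOOLS=${VMKFSTOOLS:-vmkfstools}

# Succeed only for descriptor files; extent and CBT data files are skipped.
is_descriptor() {
    case "$1" in
        *-flat.vmdk|*-delta.vmdk|*-sesparse.vmdk|*-ctk.vmdk) return 1 ;;
        *.vmdk) return 0 ;;
        *) return 1 ;;
    esac
}

# Check each descriptor in the directory; return non-zero if any check fails.
check_chain() {
    vm_dir=$1
    failed=0
    for f in "$vm_dir"/*.vmdk; do
        [ -f "$f" ] || continue
        is_descriptor "$f" || continue
        echo "== $f"
        "$VMKFSTOOLS" -q -v1 "$f" || failed=1
    done
    return "$failed"
}
```

Example: check_chain <VM directory>. The loop keeps going after a failure so that every disk is reported, and the non-zero return value signals that at least one descriptor needs attention.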

 

4. Increase Consolidation Timeout:
Add the following setting to the VM's .vmx file, or set it through the VM's advanced configuration parameters:

diskLib.consolidateTimeOut = "30"

Reference: Broadcom KB 321362
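
When the setting is added by editing the .vmx file directly, it helps to avoid duplicate entries for the same key. The following is a minimal sketch that replaces any existing line for a key and appends the new value, keeping a one-time backup. It assumes the VM is powered off while the file is edited; set_vmx_option and the .bak suffix are hypothetical names, and the key and value in the example are the ones given in this step.

```sh
#!/bin/sh
# Sketch only: set one key in a .vmx file without leaving duplicate entries.
# set_vmx_option and the .bak suffix are hypothetical, not part of any VMware tool.
# Assumes the VM is powered off while its .vmx file is edited.
set_vmx_option() {
    vmx_file=$1
    key=$2
    value=$3
    # Back up once; a re-run must not overwrite the copy of the original file.
    if [ ! -e "$vmx_file.bak" ]; then
        cp -p "$vmx_file" "$vmx_file.bak" || return 1
    fi
    # Delete any existing line for this key. The dots in the key act as regex
    # wildcards here, which is loose but sufficient for this sketch.
    sed -i "/^[[:space:]]*${key}[[:space:]]*=/d" "$vmx_file" || return 1
    # Append the new setting in the usual key = "value" form.
    printf '%s = "%s"\n' "$key" "$value" >> "$vmx_file"
}
```

Example: set_vmx_option <vmx file> diskLib.consolidateTimeOut 30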

 

5. Retry Consolidation:
Initiate the consolidation process again. Due to the VM's size and number of disks, this operation may take an extended amount of time.

 

Next Steps:

  • If consolidation fails again, retry consolidation on the affected disk(s).

  • Repeat until all disks are successfully consolidated.

  • Once completed, power on the VM and verify normal operation.

  • If none of the above steps resolves the issue, clone the VM.

Additional Information

  • This issue is commonly seen with large VMs that have 10 or more disks, where consolidation runs longer than the default timeout.

  • CBT references may persist even after snapshot removal, especially when storage pressure or metadata corruption occurred.