Taking a snapshot of a virtual machine with virtual disk over 2TB on an EMC VMAX array results in corrupted redo logs

Article ID: 306930

Updated On: 11-17-2024

Products

VMware vSphere ESXi

Issue/Introduction

Virtual machines running on datastores hosted on an EMC VMAX SAN array exhibit these issues:

When taking a snapshot of a virtual machine with a virtual disk (vmdk) of 2 TB or greater in size, the redo logs are corrupted.
The affected virtual machine displays a message that a virtual disk is corrupted and the virtual machine is powered off.
The virtual machine fails to power on after the snapshot is taken.
The virtual machine with a corrupted redo log fails to access any data on the corrupted virtual disk.
Snapshot consolidation or deletion on the virtual machine fails.

Environment

VMware vSphere ESXi 5.5
VMware vSphere ESXi 5.1

Cause

The virtual machine redo log contains the metadata for the virtual machine snapshot. By default, disks larger than 2 TB and linked clones use SE-Sparse snapshot virtual disks. With SE-Sparse snapshot virtual disk files, WriteSame operations on the VMAX array may silently fail. When the ESXi 5.5 host detects the corruption, virtual machine power on/power off tasks are restricted.

Resolution

This is a known issue affecting ESXi 5.1 and 5.5.

Note: Before modifying or deleting any virtual machine files, VMware recommends that you create a full backup of the files.

If your environment is at risk and you want to avoid this issue:

Temporarily disable VAAI handling on EMC VMAX LUNs. The issue does not appear to occur when VAAI is disabled.

To disable VAAI for a specific storage type, use the esxcli command to delete the existing hardware acceleration claim rules.

For more information, see: Disabling the VAAI functionality in ESXi/ESX (1033665)
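
As a rough illustration of the claim-rule step, the following Python sketch only assembles the esxcli command lines for removing one hardware acceleration claim rule; it does not execute anything. The helper name and the rule ID used in the example are hypothetical. Read the actual rule ID for your array from the `claimrule list` output and follow the referenced article for the authoritative procedure.

```python
def vaai_claimrule_removal_commands(rule_id):
    """Build esxcli command lines for removing one VAAI claim rule.

    rule_id is an assumption supplied by the caller; confirm it against
    the output of the 'claimrule list' commands before removing anything.
    """
    classes = ("Filter", "VAAI")
    commands = []

    # List the current rules in both classes so the rule ID can be verified.
    for cls in classes:
        commands.append(
            f"esxcli storage core claimrule list --claimrule-class={cls}"
        )

    # Remove the rule from the Filter class and from the VAAI class.
    for cls in classes:
        commands.append(
            f"esxcli storage core claimrule remove --rule {rule_id} "
            f"--claimrule-class={cls}"
        )

    # Reload the rule sets, then run the Filter class so the change applies.
    for cls in classes:
        commands.append(
            f"esxcli storage core claimrule load --claimrule-class={cls}"
        )
    commands.append("esxcli storage core claimrule run --claimrule-class=Filter")

    return commands


if __name__ == "__main__":
    # 65430 is a placeholder rule ID for illustration only.
    for line in vaai_claimrule_removal_commands(65430):
        print(line)
```

Printing the commands instead of running them leaves room to review each line against the host's actual claim rules before making any change.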

To work around this issue, use one of these options:

Do not take snapshots of virtual machines with virtual disks larger than 2 TB if the virtual disk is residing on the VMAX datastore.
Migrate any virtual disks larger than 2 TB to an alternate SAN array, if possible.
Delete the corrupt redo log and manually revert to an older snapshot. This may allow the virtual machine to power on again.

Warning: If all redo logs are corrupt, the virtual machine must be recovered from backup.
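
To decide which virtual machines the first two workarounds apply to, it can help to flag every virtual disk that is both at or above the 2 TB threshold and stored on a VMAX datastore. The sketch below is a minimal example under stated assumptions: the `VirtualDisk` record, its fields, and the sample inventory are hypothetical, the inventory would have to come from your own tooling, and 2 TB is taken as 2 * 1024^4 bytes.

```python
from dataclasses import dataclass

# Assumption: 2 TB is interpreted as a binary value (2 * 1024^4 bytes).
TWO_TB_BYTES = 2 * 1024**4


@dataclass
class VirtualDisk:
    """Hypothetical inventory record for one virtual disk."""
    vm_name: str
    path: str
    size_bytes: int
    on_vmax: bool  # True if the backing datastore is on a VMAX array


def at_risk_disks(disks):
    """Return disks of 2 TB or more that reside on a VMAX datastore."""
    return [d for d in disks if d.on_vmax and d.size_bytes >= TWO_TB_BYTES]


if __name__ == "__main__":
    # Sample data for illustration only.
    inventory = [
        VirtualDisk("vm-a", "[vmax-ds1] vm-a/vm-a.vmdk", 3 * 1024**4, True),
        VirtualDisk("vm-b", "[vmax-ds1] vm-b/vm-b.vmdk", 500 * 1024**3, True),
        VirtualDisk("vm-c", "[other-ds] vm-c/vm-c.vmdk", 3 * 1024**4, False),
    ]
    for disk in at_risk_disks(inventory):
        print(f"{disk.vm_name}: {disk.path}")
```

Disks reported by such a check are candidates for migration to another array, and their virtual machines should not be snapshotted while the disks remain on the VMAX datastore.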

Additional Information

Disabling the VAAI functionality in ESXi/ESX
