REDO log corruption is reported after restoring the virtual machine

Article ID: 345684

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • The virtual machine fails to power on after it is restored from a snapshot LUN or replica LUN.

  • A REDO log corruption message is reported in the hostd log and in the virtual machine log.

  • In the /var/log/hostd.log file, you see entries similar to:

    2016-10-07T12:21:22.123Z [3F040B90 verbose 'Vmsvc.vm:/vmfs/volumes/5384db8c-715d6034-cfb4-d485647708cc/VM_NAME/VM_NAME.vmx'] Handling message _vmx2: The redo log of VM_NAME-000001.vmdk is corrupted. If the problem persists, discard the redo log.

  • In the vmware.log file, you see entries similar to:

    2016-09-27T21:13:14.437Z| vcpu-0| I120: Msg_Question:
    2016-09-27T21:13:14.437Z| vcpu-0| I120: [msg.hbacommon.corruptredo] The redo log of VM_NAME_2-000001.vmdk is corrupted. If the problem persists, discard the redo log.
    2016-09-27T21:13:14.437Z| vcpu-0| I120: ----------------------------------------
    2016-09-27T21:19:21.047Z| vcpu-0| I120: MsgQuestion: msg.hbacommon.corruptredo reply=0
    2016-09-27T21:19:21.057Z| vcpu-0| I120: Exiting because of failed disk operation.
    2016-09-27T21:20:21.144Z| vcpu-0| W110: A core file is available in "/vmfs/volumes/501aa3a8-4ca80d42-791f-002655503e15/VM_NAME/vmx-zdump.000"
    2016-09-27T21:20:21.152Z| vcpu-0| W110: Writing monitor corefile "/vmfs/volumes/501aa3a8-4ca80d42-791f-002655503e15/VM_NAME/vmmcores.gz"

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
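
To check a copy of hostd.log or vmware.log for this symptom, the message text shown in the excerpts above can be matched with a regular expression. The following Python sketch is only an illustration: the function name find_corrupt_redo_disks is hypothetical, and the pattern assumes the message wording is exactly as it appears in the examples above.

```python
import re

# Assumption: the message wording matches the log excerpts in this article.
CORRUPT_REDO = re.compile(r"The redo log of (?P<disk>.+?\.vmdk) is corrupted")


def find_corrupt_redo_disks(log_text):
    """Return the delta disk names reported as corrupted.

    Names are returned in order of first appearance, without duplicates.
    """
    disks = []
    for match in CORRUPT_REDO.finditer(log_text):
        disk = match.group("disk")
        if disk not in disks:
            disks.append(disk)
    return disks


if __name__ == "__main__":
    sample = (
        "2016-09-27T21:13:14.437Z| vcpu-0| I120: [msg.hbacommon.corruptredo] "
        "The redo log of VM_NAME_2-000001.vmdk is corrupted. "
        "If the problem persists, discard the redo log.\n"
    )
    print(find_corrupt_redo_disks(sample))
```

The function takes the log contents as a string, so it can be run against text copied from either log file.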


Environment

VMware ESXi 3.5.x Installable
VMware ESXi 4.1.x Embedded
VMware ESX Server 3.0.x
VMware vSphere ESXi 6.0
VMware ESXi 4.0.x Installable
VMware ESX Server 2.5.x
VMware ESX Server 2.0.x
VMware ESXi 4.0.x Embedded
VMware ESX Server 2.1.x
VMware ESX Server 3.5.x
VMware ESXi 3.5.x Embedded
VMware ESX Server 1.5.x
VMware ESX Server 1.x
VMware ESX 4.1.x
VMware vSphere ESXi 5.5
VMware ESX 4.0.x
VMware vSphere ESXi 5.1
VMware vSphere ESXi 5.0
VMware ESXi 4.1.x Installable

Cause

This issue might occur when powering on a VM with vSphere snapshots in these scenarios:
  • The VMFS datastore on which the VM is hosted is a replica of a different VMFS datastore.
  • The VM is restored from a storage-based snapshot of a VMFS datastore or of an NFS share before it is powered on.
The ESXi host keeps the delta disk metadata, including the delta disk header, in memory. Updates to the delta disk header are made in memory as required, and the changes are written to disk only upon certain events, such as snapshot consolidation or when the delta disk is closed.
Storage snapshot operations and storage replication are transparent to ESXi hosts. If the storage snapshot used to restore a VM was taken before the delta disk header changes were flushed to disk, the delta disk metadata on the restored VM is not consistent. Similarly, a synchronous or asynchronous replica of a VMFS file system might not contain all header changes, because they might not have been flushed to disk at the moment replication of the underlying LUN was stopped.
Note: The corrupt redo log message indicates only that the in-memory delta disk metadata was not in sync with the on-disk metadata when the storage snapshot was taken or when LUN replication was stopped.
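
The timing problem described above can be illustrated with a toy model. The following Python sketch is not how ESXi implements delta disks. The DeltaDisk class, its "generation" field, and the function names are all hypothetical and exist only to show why a storage snapshot taken between a header update and a flush captures stale metadata.

```python
class DeltaDisk:
    """Toy model of a delta disk header held in memory and on disk."""

    def __init__(self):
        # "generation" is a made-up field standing in for header contents.
        self.in_memory_header = {"generation": 0}
        self.on_disk_header = {"generation": 0}

    def update_header(self):
        # Header updates are applied in memory only.
        self.in_memory_header["generation"] += 1

    def flush(self):
        # Stands in for events such as consolidation or closing the disk.
        self.on_disk_header = dict(self.in_memory_header)

    def storage_snapshot(self):
        # The array or filer sees only on-disk state, never host memory.
        return dict(self.on_disk_header)


def snapshot_is_consistent(disk, snapshot):
    """True if the snapshot holds the header the host currently has in memory."""
    return snapshot == disk.in_memory_header


if __name__ == "__main__":
    disk = DeltaDisk()
    disk.update_header()
    stale = disk.storage_snapshot()   # taken before the flush
    disk.flush()
    fresh = disk.storage_snapshot()   # taken after the flush
    print(snapshot_is_consistent(disk, stale), snapshot_is_consistent(disk, fresh))
```

In this model, only the snapshot taken after the flush matches the in-memory header, which mirrors the inconsistency that the corrupt redo log message reports.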

Resolution

To avoid this issue, follow these best practices:

  • Ensure that the virtual machines are not running on a snapshot when a storage snapshot is taken.
  • Perform storage array or filer snapshots at times when virtual machine snapshots are less likely to be taken.
  • Restore virtual machines from snapshot LUNs that were taken when the virtual machines were either powered off or had no snapshots running.
  • Minimize the frequency of storage array or filer snapshots to reduce overlap with manual or backup-initiated VM snapshots.
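
As a rough aid for the first practice above, a listing of the files in a VM's directory can be checked for snapshot delta disks before a storage snapshot is scheduled. The following Python sketch is only an illustration: the function names are hypothetical, and the pattern assumes that snapshot disks are named with a six-digit suffix, as in the VM_NAME-000001.vmdk example in the log excerpts. A base disk whose name happens to end in the same way would be reported as well.

```python
import re

# Assumption: snapshot disks end in "-NNNNNN.vmdk", as in VM_NAME-000001.vmdk.
SNAPSHOT_DISK = re.compile(r"-\d{6}\.vmdk$")


def snapshot_disks(file_names):
    """Return the file names that look like snapshot disks, sorted."""
    return sorted(name for name in file_names if SNAPSHOT_DISK.search(name))


def has_no_snapshot_disks(file_names):
    """True if no file in the listing looks like a snapshot disk."""
    return not snapshot_disks(file_names)


if __name__ == "__main__":
    listing = ["VM_NAME.vmx", "VM_NAME.vmdk", "VM_NAME-000001.vmdk"]
    print(snapshot_disks(listing))
    print(has_no_snapshot_disks(listing))
```

The check works on a list of file names only, so the listing can come from any source, such as a directory listing saved from the datastore browser.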



Additional Information

Even when the storage system is able to take a snapshot of a VM, only crash consistency with respect to concurrent I/Os is guaranteed.
This means that all in-flight I/Os on the array or filer are allowed to complete before the storage snapshot is taken. This does not involve:
  • Virtual machine quiescing (either filesystem or application consistent).
  • In-memory state of the delta disks in vSphere (as explained above).
