VM fails to power on due to redo log corruption.

Article ID: 417410

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • The issue occurs after a vMotion failure with the error: The operation cannot be allowed at the current time because the virtual machine has a question pending: 'msg.hbacommon.corruptredo:The redo log of 'vmname_1-000004.vmdk' is corrupted. If the problem persists, discard the redo log.'.
  • /vmfs/volumes/<datastore>/<vmname>/vmware.log reports the following error:
    YYYY-MM-DDTHH:MM:SS. In(05) vcpu-0 - [msg.hbacommon.corruptredo] The redo log of 'vmname_1-000004.vmdk' is corrupted. If the problem persists, discard the redo log.
    YYYY-MM-DDTHH:MM:SS. In(05) vcpu-0 - MsgQuestion: msg.hbacommon.corruptredo reply=0
  • Cloning the affected snapshot disk with vmkfstools fails at 100% with the following error:
    vmkfstools -i /vmfs/volumes/<datastore>/<vmname>/vmname_1-000004.vmdk /vmfs/volumes/<datastore>/<vmname>/vmname_2.vmdk
    Cloning disk /vmfs/volumes/<datastore>/<vmname>/vmname_1-000004.vmdk...
    Clone: 100% done.Failed to clone disk: Invalid change tracker error code (7228).
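
To confirm which disk the log flags, the vmware.log check described above can be scripted. The following is a minimal sketch, assuming the log line has the format shown in the sample; the function name list_corrupt_redo is hypothetical and not part of any VMware tooling.

```shell
# Print the unique disk names that vmware.log flags as having a corrupted redo log.
# Hypothetical helper; assumes the message format shown in the sample log line.
# Usage: list_corrupt_redo /vmfs/volumes/<datastore>/<vmname>/vmware.log
list_corrupt_redo() {
    log_file="$1"
    # Keep only lines carrying the bracketed message ID, then extract the quoted .vmdk name.
    grep -F '[msg.hbacommon.corruptredo]' "$log_file" \
        | sed -n "s/.*The redo log of '\([^']*\.vmdk\)'.*/\1/p" \
        | sort -u
}
```

An empty result means no matching message was found in that log file.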

Cause

File corruption is random in nature.
This issue can be caused by various circumstances, including but not limited to:
  • Hardware issues with the storage controller or storage device.
  • Connectivity issues between the ESX host and the storage device.
  • The datastore containing the snapshot disks running out of free disk space.

Resolution

  • Remove the virtual disk that vmware.log reports as failing from the VM configuration, without deleting its files from the datastore.
  • Edit the snapshot disk's descriptor file and comment out the change tracking (CTK) reference:
    [root@esxi:/vmfs/volumes/<datastore>/<vmname>] vi vmname_1-000004.vmdk
    # Disk DescriptorFile
    version=5
    encoding="UTF-8"
    CID=########
    parentCID=########
    createType="vmfsSparse"
    parentFileNameHint="vmname_1-000003.vmdk"
    # Extent description
    RW ###### VMFSSPARSE "vmname_1-000004-delta.vmdk"
    
    # Change Tracking File
    changeTrackPath="vmname_1-000004-ctk.vmdk" <-------------------------- Comment out using #
    
    # The Disk Data Base
    #DDB
    
    ddb.iofilters = "spm"
    ddb.longContentID = "##############################"
    ddb.sidecars = "##############################"
  • Clone the impacted snapshot disk with vmkfstools
    vmkfstools -i /vmfs/volumes/<datastore>/<vmname>/vmname_1-000004.vmdk /vmfs/volumes/<datastore>/<vmname>/vmname_2.vmdk
  • Attach the cloned disk to the VM.
  • Power on the VM. 
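
The descriptor edit in the steps above can also be done non-interactively. The following is a minimal sketch, assuming the descriptor is the small text .vmdk file (never the -delta.vmdk data file) and that the entry starts at the beginning of a line; the function name disable_ctk_reference and the .bak suffix are hypothetical choices.

```shell
# Comment out the changeTrackPath entry in a VMDK descriptor, keeping a backup copy.
# Hypothetical helper; run it only against the text descriptor, not the -delta.vmdk file.
# Usage: disable_ctk_reference vmname_1-000004.vmdk
disable_ctk_reference() {
    descriptor="$1"
    # Keep an untouched copy so the edit can be reverted (the ".bak" suffix is an assumption).
    cp "$descriptor" "$descriptor.bak" || return 1
    # Prefix any active changeTrackPath line with '#'; already commented lines are left alone.
    sed 's/^changeTrackPath=/#&/' "$descriptor.bak" > "$descriptor"
}
```

Reading from the backup and writing to the original avoids relying on in-place editing support in the host's sed.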

Additional Information

If the original attempt to clone the disk fails with a message similar to:

failed to clone disk: Bad file descriptor (589833)

Then create a folder in the VM directory, move the CTK files into it, and try the clone again:

mkdir bkup

mv *ctk.vmdk ./bkup/

Then rerun the vmkfstools command from the Resolution section.
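
The mkdir and mv steps can be wrapped so that they are safe to repeat and do nothing when no CTK files are present. The following is a minimal sketch using the same bkup folder name and *ctk.vmdk pattern as above; the function name stash_ctk_files is hypothetical.

```shell
# Move every change tracking (*ctk.vmdk) file in a VM directory into a bkup subfolder.
# Hypothetical helper; the folder name and file pattern follow the commands above.
# Usage: stash_ctk_files /vmfs/volumes/<datastore>/<vmname>
stash_ctk_files() {
    vm_dir="$1"
    mkdir -p "$vm_dir/bkup" || return 1
    for ctk in "$vm_dir"/*ctk.vmdk; do
        # When nothing matches, the pattern is passed through literally; skip it.
        [ -e "$ctk" ] || continue
        mv "$ctk" "$vm_dir/bkup/" || return 1
    done
}
```

The files are moved rather than deleted, so they can be returned to the VM directory if needed.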