EMC VMAX and PowerMAX arrays experience VMFS corruption when using VAAI XCOPY on vSphere 6.x

Article ID: 326263

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • Following a Storage vMotion or VM cloning operation that uses the VAAI XCOPY feature, the destination datastore may become corrupted and be taken offline.
  • A devID mismatch is logged for the destination datastore:

    2019-08-12T10:56:01.122Z cpu42:662296)LVM: 4120: [naa.60000970000297900266533030303732:1] devID mismatch:
    2019-08-12T10:56:01.122Z cpu42:662296)LVM: 4121: Cached: <naa.60000970000297900266533030303732:1>
    2019-08-12T10:56:01.122Z cpu42:662296)LVM: 4122: Read: 5d09cd35-########-####-##########e2

     
  • Next, the VMFS volume is forced offline, with messages similar to:

    2019-08-12T10:56:01.122Z cpu42:662296)LVM: 6493: Forcing APD unregistration of devID 5d09cdc6-########-####-##########e2 in state 1.
    2019-08-12T10:56:01.122Z cpu42:662296)LVM: 5866: Could not open device , vol [5d09cdc6-########-####-#########e2, 5d09cdc6-########-####-#########e2, 1]: Device does not contain a logical volume
    2019-08-12T10:56:01.125Z cpu42:662296)Vol3: 3155: Failed to get object 28 type 1 uuid 5d09cdcc-########-####-##########e2 FD 0 gen 0 :No filesystem on the device
    2019-08-12T10:56:01.125Z cpu42:662296)WARNING: Fil3: 1389: Failed to reserve volume f530 28 1 5d09cdcc ######## ######## ######b3 0 0 0 0 0 0 0

     
  • Hosts in the cluster that did not initiate the clone or migration may log messages similar to:

    2019-08-12T11:28:21.515Z cpu18:65844)WARNING: Vol3: 2422: FS 5d09ce55-########-####-##########e2 uuid change detected, possibly got re-formatted by other host, new uuid 5d09cd3b-########-####-##########e2

    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
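
The message signatures in the excerpts above can be searched for in a saved host log. The following is a minimal Python sketch, not a VMware-supplied tool: the function name `find_corruption_signatures` and the signature labels are hypothetical, and the matched fragments are copied from the example messages, so they may need adjusting if the wording differs in your environment.

```python
# Message fragments copied from the example log excerpts in this article.
# The labels on the left are illustrative names, not VMware terminology.
SIGNATURES = {
    "devid_mismatch": "devID mismatch",
    "no_logical_volume": "Device does not contain a logical volume",
    "no_filesystem": "No filesystem on the device",
    "uuid_change": "uuid change detected, possibly got re-formatted by other host",
}


def find_corruption_signatures(lines):
    """Return (line_number, label, text) for every line containing a known fragment."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, fragment in SIGNATURES.items():
            if fragment in line:
                hits.append((number, label, line.rstrip("\n")))
    return hits
```

To use it, pass the lines of a log file copied from the host, for example `find_corruption_signatures(open("vmkernel.log"))`, where the file name is an assumption about where the log was saved. A hit only shows that a matching message was logged; it does not by itself confirm corruption.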

Environment

VMware vSphere ESXi 6.0
VMware vSphere ESXi 6.5
VMware vSphere ESXi 6.7

Resolution

 

VMware and Dell EMC engineering teams have worked closely to develop fixes that prevent the datastore from being overwritten during an XCOPY session. Install both the Dell microcode fix and the ESXi patch before re-enabling XCOPY on the host or array.

Patches for ESXi 6.5 and 6.7 have been created that intercept all attempts to overwrite block 0 on EMC Symmetrix-based storage arrays (VMAX/PowerMAX). With the array fix in place, if the array detects an XCOPY attempt to overwrite block 0, it rejects the attempt and returns a custom SCSI sense code to the host. An ESXi host with the patch installed interprets this rejection and fails the XCOPY operation entirely instead of reverting to the software datamover. The operation must then be attempted again.

The Dell EMC fix was released in February 2020 to address this corruption issue. As of May 4, 2020, Dell has released microcode enhancement 105521, which supersedes enhancement 104495. This fix is referenced in Dell's DTA 537000. For details, see Dell EMC Knowledgebase article 000002467.

VMware has released patches for ESXi 6.5 and 6.7 that protect the VMFS metadata region and report attempts to alter it:

ESXi 6.7: https://docs.vmware.com/en/VMware-vSphere/6.7/rn/esxi670-202004002.html

ESXi 6.5: https://docs.vmware.com/en/VMware-vSphere/6.5/rn/esxi650-202007001.html


Workaround:
Until you are able to install the fix from Dell EMC, the workaround is to disable the VAAI XCOPY feature on the ESXi hosts.

To disable hardware accelerated move using the vSphere Web Client:
  1. Browse to the host in the vSphere Web Client navigator.
  2. Click the Configure tab.
  3. Under System, click Advanced System Settings.
  4. Change the value of DataMover.HardwareAcceleratedMove to 0 (disabled).
For more information, see Disabling Hardware Accelerated Move (XCOPY) in ESXi (2146567).
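
The same advanced setting can also be changed from the ESXi command line with `esxcli system settings advanced set`. The following Python sketch builds and runs that command. It assumes it is executed in an ESXi shell where `esxcli` is on the PATH; the helper names (`build_set_command`, `build_list_command`, `disable_xcopy`) are hypothetical, introduced only for illustration.

```python
import subprocess

# esxcli path for the setting shown in the vSphere Web Client as
# DataMover.HardwareAcceleratedMove.
OPTION = "/DataMover/HardwareAcceleratedMove"


def build_set_command(enabled):
    """Build the esxcli argument list that sets the option to 1 (enabled) or 0 (disabled)."""
    value = "1" if enabled else "0"
    return ["esxcli", "system", "settings", "advanced", "set",
            "--int-value", value, "--option", OPTION]


def build_list_command():
    """Build the esxcli argument list that displays the option's current value."""
    return ["esxcli", "system", "settings", "advanced", "list",
            "--option", OPTION]


def disable_xcopy(run=subprocess.run):
    """Disable hardware accelerated move, then display the setting for verification."""
    run(build_set_command(enabled=False), check=True)
    run(build_list_command(), check=True)
```

Calling `disable_xcopy()` on each host applies the workaround. After both the Dell microcode fix and the ESXi patch are installed, running `build_set_command(enabled=True)` through `subprocess.run` restores the setting to 1.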

Additional Information

Impact/Risks:
VMFS corruption