VMware vCenter Site Recovery Manager (SRM) Testfailover and Failover operations fail and the vmware-dr log files report the error: No host is compatible with the virtual machine

Article ID: 330858

Products

VMware Live Recovery

Issue/Introduction

Symptoms:

  • VMware vCenter Site Recovery Manager (SRM) Test Failover and Failover operations fail.
  • In the SRM vmware-dr log files, you see entries similar to:
YYYY-MM-DDT15:03:56.285-04:00 [05796 verbose 'StorageProvider' opID=308DC337-000002EA] Recovered datastore 'ds:///vmfs/volumes/532c7a6a-########-####-#########c3/' as 'datastore-146' in datacenter 'datacenter-2'
:
YYYY-MM-DDT15:03:56.286-04:00 [05380 verbose 'StorageProvider' opID=308DC337-000002EA] Updating embedded paths in virtual machine files on datastore 'datastore-146': ["/vmfs/volumes/532c7a6a-########-####-#########c3" => 'vim.Datastore:datastore-146', "/vmfs/volumes/532c7a75-aba5404b-2b1a-9cb654866dc3" => 'vim.Datastore:datastore-147']
:
YYYY-MM-DDT15:03:58.517-04:00 [05368 warning 'StorageProvider' opID=308DC337-000002EA] Failed to update embedded paths in 3 VM files on datastore 'datastore-146':

(vim.UpdateVirtualMachineFilesResult.FailedVmFileInfo) [
--> (vim.UpdateVirtualMachineFilesResult.FailedVmFileInfo) {
--> dynamicType = <unset>,
--> vmFile = "/vmfs/volumes/533c5ef8-#########-####-##########f8/SRMTestJiva02-Thick/SRMTestJiva02-Thick.vmx",
--> fault = (vim.fault.InvalidVmConfig) {
--> dynamicType = <unset>,
--> faultCause = (vmodl.MethodFault) null,
--> property = "snapshot.dict",
--> msg = "Invalid virtual machine configuration.",

2014-04-02T15:04:01.538-04:00 [05808 warning 'PlaceholderVmManager' opID=308DC337-000002EA:3fda:db60] [PlaceholderVm] Failed patch VM '[vim.VirtualMachine:vm-61]' after reload: (vim.fault.InvalidState) {
--> dynamicType = <unset>,
--> faultCause = (vmodl.MethodFault) null,
--> msg = "The operation is not allowed in the current state.",
--> }

(vim.fault.NoCompatibleHost) {
--> dynamicType = <unset>,
--> faultCause = (vmodl.MethodFault) null,
--> host = (vim.HostSystem) [
--> 'vim.HostSystem:host-31'
--> ],
--> error = (vmodl.MethodFault) [
--> (vim.fault.InvalidState) {
--> dynamicType = <unset>,
--> faultCause = (vmodl.MethodFault) null,
--> msg = "The operation is not allowed in the current state.",
--> }
--> ],
--> msg = "No host is compatible with the virtual machine.",
--> }
  • In the vmkernel log files, located at /var/log on the ESXi host where the virtual machine is running, you see errors similar to:
YYYY-MM-DDT19:04:20.277Z cpu31:36858 opID=f9cc1551)Vol3: 2612: Failed to get object 28 type 3 uuid 533c5ef8-#########-####-##########f8 FD 2c08944 gen d :Not found
YYYY-MM-DDT19:04:20.278Z cpu31:36858 opID=f9cc1551)Vol3: 2612: Failed to get object 28 type 3 uuid 533c5ef8-#########-####-##########f8 FD 2c08944 gen d :Not found
:
:
YYYY-MM-DDT19:05:03.906Z cpu16:34294)WARNING: Fil3: 15068: Found invalid object on 533c5ef8-#########-####-##########f8 < FD c0 r0 > expected < FD c606 r16 >
YYYY-MM-DDT19:05:03.906Z cpu16:34294)Vol3: 2612: Failed to get object 28 type 2 uuid 533c5ef8-#########-####-##########f8 FD 4009784 gen 15 :Not found
YYYY-MM-DDT19:05:03.908Z cpu16:34294)WARNING: Fil3: 15068: Found invalid object on 533c5ef8-#########-####-##########f8 < FD c0 r0 > expected < FD c606 r16 >
  • In the hostd log files, located at /var/log on the ESXi host where the virtual machine is running, you see errors similar to:
YYYY-MM-DDT19:03:56.580Z [FFFCBB70 info 'DiskLib' opID=308DC337-000002EA-76-80 user=vpxuser] DISKLIB-LINK : "/vmfs/volumes/533c5ef8-#########-####-##########f8/SRMTest06-Thick/SRMTest06-Thick-flat.vmdk" : failed to open (The file specified is not a virtual disk).

YYYY-MM-DDT19:03:57.446Z [FFFCBB70 info 'vm:SNAPSHOT: Snapshot_PathPrefixChange: failed to fix paths in dictionary /vmfs/volumes/533c5ef8-#########-####-##########f8/SRMTest06-Thick/SRMTest06-Thick.vmx' opID=308DC337-000002EA-76-80 user=vpxuser] Dictionary problem (6).

YYYY-MM-DDT19:04:00.470Z [6B440B70 info 'vm:VMHSVMLoadConfig failed: "/vmfs/volumes/533c5ef8-#########-####-##########f8/SRMTest06-Thick/SRMTest06-Thick.vmx'] is not a valid virtual machine configuration file. (VMX file is corrupt)

YYYY-MM-DDT19:04:00.470Z [6AAC2B70 info 'Vmsvc.vm:/vmfs/volumes/533c5ef8-#########-####-##########f8/SRMTest06-Thick/SRMTest06-Thick.vmx'] Foundry_[Create|Open]Ex failed: Error: (4002) Cannot read the virtual machine configuration file

YYYY-MM-DDT19:04:00.471Z [6AAC2B70 info 'Vmsvc.vm:/vmfs/volumes/533c5ef8-#########-####-##########f8/SRMTest06-Thick/SRMTest06-Thick.vmx'] Failed to load virtual machine.
  • From the preceding log entries, you can see that the recovered Virtual Machine File System (VMFS) datastores are affected by data corruption.
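When several log files have to be checked, a small script can flag the lines that resemble the entries above. The following is a minimal sketch, not a VMware tool: the function name and the source labels are illustrative, and the substrings are simply copied from the log excerpts in this article, so other corruption messages would not be detected.

```python
# Substrings copied from the log excerpts in this article.
# The source labels ("vmware-dr", "vmkernel", "hostd") are illustrative only.
SIGNATURES = {
    "No host is compatible with the virtual machine": "vmware-dr",
    "Failed to update embedded paths": "vmware-dr",
    "Failed to get object": "vmkernel",
    "Found invalid object": "vmkernel",
    "The file specified is not a virtual disk": "hostd",
    "failed to fix paths in dictionary": "hostd",
    "VMX file is corrupt": "hostd",
    "Cannot read the virtual machine configuration file": "hostd",
}


def find_signatures(lines):
    """Return (line_number, source_label, signature) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for signature, source in SIGNATURES.items():
            if signature in line:
                hits.append((number, source, signature))
    return hits
```

The function accepts any iterable of strings, so an open log file can be passed directly, for example `find_signatures(open("/var/log/hostd.log", errors="replace"))`. A match only suggests where to look; it does not by itself confirm corruption.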



Cause

This issue occurs when third-party data replication from the Site-1 storage array to the Site-2 storage array is not working correctly.

Resolution

Note: Datastore corruption can be difficult to identify from its symptoms alone. Use these troubleshooting steps to find where the data corruption is occurring.
 
To work around this issue:
  1. Verify that the data corruption does not originate from the source VMFS datastore. To do this, use the vSphere On-disk Metadata Analyzer (VOMA) tool, which is included with ESXi 5.1 and later. For ESXi 5.0 or 4.x, gather a binary dump of the VMFS datastore metadata and provide it to Broadcom Technical Support for analysis.

    Note: The VMFS datastore must be unmounted before you run the VOMA tool. For more information, see Using vSphere On-disk Metadata Analyzer (VOMA) to check VMFS metadata consistency (2036767). If the source datastore is corrupted, you must restore it from backup or initiate data recovery operations. Both of these operations are beyond the scope of this document, so contact your storage vendor for further details.

  2. If no corruption is found in the source VMFS datastore, then the data corruption is associated with the destination datastore. This indicates that the corruption is caused by the replication process or by a replication configuration problem. Perform these steps to troubleshoot the issue:

    1. Manually present the replicated LUN at the DR site to the ESXi hosts.

      Note: Performing this step manually, outside of SRM, simplifies the troubleshooting.

    2. Manually mount the VMFS datastore located on that device.
    3. Manually register the virtual machines on that datastore.
    4. Attempt to change the configuration of a virtual machine. For example, try to change the network configuration.

      Note: SRM performs the same action when it reconfigures the recovered virtual machines before powering them on.

    5. Attempt to power on the virtual machines. If a problem is encountered, check the vmkernel and hostd log files for specific corruption entries (as listed in the Symptoms section) to locate the files that are corrupted.

  3. After the location of the data corruption is identified, contact your storage vendor for further details as they may have special tools to perform data integrity checks at the storage array level. Perform these steps to find replication problems:
    1. Create a file on the source VMFS datastore.
    2. Calculate the MD5 checksum on this source file.
    3. Replicate/Synchronize the changes from the source VMFS device to its replica device on the DR site.
    4. Provide a copy or snapshot device (associated with the DR site device) to an ESXi host.
    5. Mount the VMFS datastore located at the copy or snapshot device.
    6. Calculate the MD5 checksum of the same file on the DR site. If the source and destination MD5 checksums do not match, the data was corrupted during replication.
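
The checksum comparison in the last set of steps can be sketched as follows. This is a minimal example using Python's standard `hashlib` module; the function names are illustrative, and any MD5 utility available on the host gives an equivalent result. Run the digest function once against the file on the source datastore and once against the same file on the mounted copy at the DR site, then compare the two values.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(source_digest, replica_digest):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return source_digest.strip().lower() == replica_digest.strip().lower()
```

A result of `False` from `checksums_match` for the same file after a completed synchronization points to corruption introduced between the source device and its replica.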



Additional Information

For more information regarding mounting of a VMFS datastore manually, see the Mount a VMFS Datastore with an Existing Signature section in the vSphere 5 Documentation Center.
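
As a rough illustration of the manual mount, the sketch below builds the `esxcli storage vmfs snapshot` command lines that list unresolved VMFS copies and mount one while keeping its existing signature. The helper names and the volume label `DR_Datastore01` are hypothetical, and the commands are assumed to be run in the ESXi Shell of a host that can see the replica device; confirm the options against the documentation for your ESXi version before using them.

```python
def snapshot_list_command():
    """Command that lists VMFS volumes detected as snapshots or replicas."""
    return ["esxcli", "storage", "vmfs", "snapshot", "list"]


def snapshot_mount_command(volume_label):
    """Command that mounts a detected copy by label, keeping its signature."""
    if not volume_label:
        raise ValueError("volume_label must not be empty")
    return ["esxcli", "storage", "vmfs", "snapshot", "mount", "-l", volume_label]


# Hypothetical label, for illustration only.
print(" ".join(snapshot_list_command()))
print(" ".join(snapshot_mount_command("DR_Datastore01")))
```

Returning the commands as argument lists keeps them easy to review before they are run, and they can be passed unchanged to `subprocess.run` if automation is wanted.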