vMotion fails at 20% for Microsoft Cluster "shared disk" virtual machines configured with Raw Device Mapping with error "A general system error occurred: Invalid fault"


Article ID: 318900


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • Migrating Microsoft Cluster (MSCS/WSFC) virtual machines configured with RDM fails at 20% with the error message "A general system error occurred: Invalid fault".
  • In hostd.log on the source ESXi host, you may see entries similar to the following:
Note: hostd.log is located at /var/run/log/hostd.log
 
[YYYY-MM-DDTHH:MM:SS] info hostd[2099840] [Originator@6876
sub=Vmsvc.vm:/vmfs/volumes/datastore/testVM_folder/testVM.vmx opID=xxxx-xxxx-xxxx-xxxxx:xxxxx-xx-xx-xx-xxxx
user=vpxuser: vc/ADMIN] VMotionPrepare (993176069561948474): Sending '
from' srcIp=xx.xx.xx.xx dstIp=xx.xx.xx.xx, type=1, encrypted=^C,
remoteThumbprint=xx:xx:xx:xx:xx:xx:xx:xx:
-->          ],
-->          message = "Unable to load configuration file
'/vmfs/volumes/datastore/testVM_folder/testVM.vmx'."
-->       },
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.dictionary.load.openFailed",
-->          arg = (vmodl.KeyAnyValue) [
-->             (vmodl.KeyAnyValue) {
-->                key = "1",
-->                value =
"/vmfs/volumes/datastore/testVM_folder/testVM.vmx"
-->             },
-->             (vmodl.KeyAnyValue) {
-->                key = "2",
-->                value = "16 (Device or resource busy)"
-->             }
-->          ],
-->          message = "Cannot open file
"/vmfs/volumes/datastore/testVM_folder/testVM.vmx":
Device or resource busy.
--> "
-->       }
-->    ],
-->    file =
"/vmfs/volumes/datastore/testVM_folder/testVM.vmx"
-->    msg = "Unable to load configuration file
'/vmfs/volumes/datastore/testVM_folder/testVM.vmx'.
--> Unable to load configuration file
'/vmfs/volumes/datastore/testVM_folder/testVM.vmx'.
--> Cannot open file
"/vmfs/volumes/datastore/testVM_folder/testVM.vmx":
Device or resource busy.
--> "

[YYYY-MM-DDTHH:MM:SS] info hostd[2099840] [Originator@6876 sub=vm
opID=kmzd0uox-1035653-auto-m746-h5:70231853-2b-01-20-9cb7 user=vpxuser: vc/ADMIN] DictionaryLoad: Cannot open file
"/vmfs/volumes/datastore/testVM_folder/testVM.vmx":
Device or resource busy.
[YYYY-MM-DDTHH:MM:SS] info hostd[2099840] [Originator@6876 sub=vm
opID=kmzd0uox-1035653-auto-m746-h5:70231853-2b-01-20-9cb7 user=vpxuser:vc/ADMIN] VigorOfflineReload: Unable to read '/vmfs/volumes/datastore/testVM_folder/testVM.vmx'.

Configuration is invalid.
 
  • In vmkernel.log, you may see entries similar to the following:
Note: vmkernel.log is located at /var/run/log/vmkernel.log

[YYYY-MM-DDTHH:MM:SS] cpu18:2098081)NMP: nmp_ResetDeviceLogThrottling:3580:
Error status H:0x0 D:0x18 P:0x0 Sense Data: 0x0 0x0 0x0 from dev
"naa.600xxxxxxxxxxxxxxxxxxxxxxxxxxxxx" occurred 1797 times(of
1797 commands
)


and

[YYYY-MM-DDTHH:MM:SS] cpu5:2098235)ScsiDeviceIO: 3449: Cmd(0x459b40e24a00) 0x1a, CmdSN 0x814639 from world 0 to dev "naa.600xxxxxxxxxxxxxxxxxxxxxxxxxxxxx" failed H:0x8 D:0x0 P:0x0 Invalid sense data: 0x0 0x50 0x3a.
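
The log signatures above can be checked from the ESXi shell. The following is a minimal sketch, assuming the default log locations noted in this article. The function name check_vmotion_rdm_symptoms is hypothetical, and the search patterns are taken only from the excerpts shown above, so matches indicate a possible occurrence of this issue rather than a confirmed diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that match the symptoms shown above.
# $1 = path to hostd.log (defaults to the location noted in this article)
# $2 = path to vmkernel.log (defaults to the location noted in this article)
check_vmotion_rdm_symptoms() {
    hostd_log="${1:-/var/run/log/hostd.log}"
    vmkernel_log="${2:-/var/run/log/vmkernel.log}"

    # grep -c prints the number of matching lines (0 when there are none).
    busy=$(grep -c "Device or resource busy" "$hostd_log" 2>/dev/null)
    conflicts=$(grep -c "D:0x18" "$vmkernel_log" 2>/dev/null)

    # Fall back to 0 if a log file could not be read.
    echo "hostd 'Device or resource busy' lines: ${busy:-0}"
    echo "vmkernel 'D:0x18' status lines: ${conflicts:-0}"
}

# Usage on the source ESXi host:
#   check_vmotion_rdm_symptoms
```

Non-zero counts in both logs around the time of the failed migration suggest that the remaining sections of this article apply.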
 
 


Environment

VMware vSphere ESXi 6.5.x
VMware vSphere ESXi 6.7.x
VMware vSphere ESXi 7.0.x

Cause

  • This issue occurs when the RDM LUN used by the MSCS/WSFC virtual machines is not marked as perennially reserved on the source or destination ESXi host.
  • VMware recommends setting the perennially reserved flag to true on RDM LUNs that are used by MSCS/WSFC (Microsoft Clustering).
  • For more information about VMware recommendations for WSFC, see vSphere MSCS Setup Checklist.
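
To see which devices on a host currently carry the flag, the output of esxcli storage core device list can be summarized. This is a minimal sketch, assuming the output layout shown in the Resolution section (an unindented device identifier followed by indented attribute lines). The function name list_perennial_flags is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `esxcli storage core device list` output on stdin
# and prints "<device identifier> <true|false>" for every device.
list_perennial_flags() {
    awk '
        /^[^ ]/ { dev = $1 }                           # unindented line: device identifier
        /Is Perennially Reserved:/ { print dev, $NF }  # last field is true or false
    '
}

# Usage on each ESXi host (source and destination):
#   esxcli storage core device list | list_perennial_flags
```

Any RDM LUN used by the cluster that reports false on either host is a candidate for the procedure in the Resolution section.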

Resolution

To fix this issue, mark the RDM LUN as perennially reserved on both ESXi hosts (source and destination).

To mark the MSCS LUNs as perennially reserved:

  1. Determine which RDM LUNs are part of an MSCS cluster. From the vSphere Client, select a virtual machine that has a mapping to the MSCS cluster RDM devices.
     
  2. Edit the virtual machine settings and navigate to the Mapped RAW LUNs. In this example, Hard disk 2.

  3. Under Physical disk, note the device in use as the RDM (that is, the VML ID).
    The VML ID is a globally unique identifier for the shared device.
  4. Identify the naa.id for this VML ID using this command:  esxcli storage core device list
    The VML ID appears in the Other UIDs field of the matching device.
For example:

esxcli storage core device list

naa.6589cfc000000a17ac02aae02067e747
   Display Name: FreeNAS iSCSI Disk (naa.6589cfc000000a17ac02aae02067e747)
   Has Settable Display Name: true
   Size: 40960
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.6589cfc000000a17ac02aae02067e747
   Vendor: FreeNAS
   Model: iSCSI Disk
   Revision: 0123
   SCSI Level: 6
   Is Pseudo: false
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: unknown
   Attached Filters:
   VAAI Status: supported
   Other UIDs: vml.010001000030303530353630313031303830310000695343534920
   Is Shared Clusterwide: true
   Is SAS: false
   Is USB: false
   Is Boot Device: false
   Device Max Queue Depth: 128
   No of outstanding IOs with competing worlds: 32
   Drive Type: unknown
   RAID Level: unknown
   Number of Physical Drives: unknown
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false
  5. Use the esxcli command to mark the device as perennially reserved:
    esxcli storage core device setconfig -d naa.id --perennially-reserved=true
For example:

esxcli storage core device setconfig -d naa.6589cfc000000a17ac02aae02067e747 --perennially-reserved=true

Note: For vSphere 7.x, see the Change Perennial Reservation Settings section of the vSphere Storage Guide.
  6. To verify that the device is perennially reserved, run this command:

    esxcli storage core device list -d naa.id

    In the output of the esxcli command, search for the entry Is Perennially Reserved: true. This shows that the device is marked as perennially reserved.

    For example:

    esxcli storage core device list -d naa.6589cfc000000a17ac02aae02067e747

    naa.6589cfc000000a17ac02aae02067e747
       Display Name: FreeNAS iSCSI Disk (naa.6589cfc000000a17ac02aae02067e747)
       Has Settable Display Name: true
       Size: 40960
       Device Type: Direct-Access
       Multipath Plugin: NMP
       Devfs Path: /vmfs/devices/disks/naa.6589cfc000000a17ac02aae02067e747
       Vendor: FreeNAS
       Model: iSCSI Disk
       Revision: 0123
       SCSI Level: 6
       Is Pseudo: false
       Status: degraded
       Is RDM Capable: true
       Is Local: false
       Is Removable: false
       Is SSD: false
       Is VVOL PE: false
       Is Offline: false
       Is Perennially Reserved: true
       Queue Full Sample Size: 0
       Queue Full Threshold: 0
       Thin Provisioning Status: unknown
       Attached Filters:
       VAAI Status: supported
       Other UIDs: vml.010001000030303530353630313031303830310000695343534920
       Is Shared Clusterwide: true
       Is SAS: false
       Is USB: false
       Is Boot Device: false
       Device Max Queue Depth: 128
       No of outstanding IOs with competing worlds: 32
       Drive Type: unknown
       RAID Level: unknown
       Number of Physical Drives: unknown
       Protection Enabled: false
       PI Activated: false
       PI Type: 0
       PI Protection Mask: NO PROTECTION
       Supported Guard Types: NO GUARD SUPPORT
       DIX Enabled: false
       DIX Guard Type: NO GUARD SUPPORT
       Emulated DIX/DIF Enabled: false

     
  7. Repeat the procedure for each Mapped RAW LUN that participates in the MSCS/WSFC cluster, on both the source and destination ESXi hosts.

    Note: The configuration is stored persistently on the ESXi host and survives restarts. To remove the perennially reserved flag, run this command:

    esxcli storage core device setconfig -d naa.id --perennially-reserved=false
  8. Retry the vMotion migration. It should now complete successfully.
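
The lookup, mark, and verify steps above can be combined in the ESXi shell. The following is a minimal sketch, not an official VMware script. The function names naa_for_vml and mark_perennially_reserved and the VML_ID variable are hypothetical. It uses only the esxcli commands shown in this article and assumes the device list layout shown in the examples.

```shell
#!/bin/sh
# Hypothetical helper: print the device identifier whose "Other UIDs" field
# contains the given VML ID. Pass the full VML ID to avoid partial matches.
# $1 = VML ID; `esxcli storage core device list` output is read from stdin.
naa_for_vml() {
    awk -v vml="$1" '
        /^[^ ]/ { dev = $1 }                              # unindented line: device identifier
        /Other UIDs:/ && index($0, vml) { print dev }     # VML ID found for this device
    '
}

# Hypothetical helper: mark a device as perennially reserved, then print the
# flag so the result can be confirmed. $1 = device identifier (naa.id).
mark_perennially_reserved() {
    esxcli storage core device setconfig -d "$1" --perennially-reserved=true &&
    esxcli storage core device list -d "$1" | grep "Is Perennially Reserved"
}

# Usage on each ESXi host (source and destination), with VML_ID set to the
# value noted from the virtual machine settings:
#   dev=$(esxcli storage core device list | naa_for_vml "$VML_ID")
#   mark_perennially_reserved "$dev"
```

Running the lookup separately on each host is deliberate, because the flag is stored per host and must be set on both the source and the destination.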



Additional Information