Storage vMotion of a virtual machine fails after five minutes

Article ID: 318891

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

Symptoms:
If you cannot connect an ESXi host to vCenter Server because of an error that contains faultCause = (vmodl.MethodFault) null, see Remediating an ESX host fails with the error: A general system error occurred: Invalid argument.
  • Storage vMotion of a virtual machine fails after 5 minutes with the error:

    vim.fault.GenericVmConfigFault
     
  • Storage vMotion of a virtual machine fails at 30% to 44%.
     
  • In the vpxa.log file, there are entries similar to:

    2017-02-07T19:57:37.825Z info vpxa[BAD5B70] [Originator@6876 sub=Default opID=MigrationWizard-applyOnMultiEntity-240543-ngc-4e-01-8e] [VpxLRO] -- ERROR task-153 -- vmotionManager -- vim.host.VMotionManager.initiateSourceEx: vim.fault.GenericVmConfigFault:
    --> Result:
    --> (vim.fault.GenericVmConfigFault) {
    --> faultCause = (vmodl.MethodFault) null,
    --> faultMessage = (vmodl.LocalizableMessage) [
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.migrate.fail.dst",
    --> arg = <unset>,
    --> message = "The source detected that the destination failed to resume."
    --> }
    --> ],
    --> reason = "The source detected that the destination failed to resume."
    --> msg = "The source detected that the destination failed to resume."
    --> }
    --> Args:
    -->
    --> Arg migrationId:
    --> 4333085553942529806
    --> Arg dstId:
    --> 188806
    [...]
    2017-02-07T19:57:37.829Z info vpxa[B96AB70] [Originator@6876 sub=Default opID=MigrationWizard-applyOnMultiEntity-240543-ngc-4e-01-d5-01] [VpxLRO] -- ERROR task-152 -- -- vim.host.VMotionManager.initiateDestination:tracking: vim.fault.GenericVmConfigFault:
    --> Result:
    --> (vim.fault.GenericVmConfigFault) {
    --> faultCause = (vmodl.MethodFault) null,
    --> faultMessage = (vmodl.LocalizableMessage) [
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.vigor.transport.connection.error",
    --> arg = <unset>,
    --> message = "Disconnected from virtual machine."
    --> },
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.vigor.transport.connection.fail",
    --> arg = (vmodl.KeyAnyValue) [
    --> (vmodl.KeyAnyValue) {
    --> key = "1",
    --> value = "9"
    --> },
    --> (vmodl.KeyAnyValue) {
    --> key = "2",
    --> value = "There is no VMware process running for config file /vmfs/volumes/4d0a870d-bb6a7fdb-30f4-842b2b0018bf/xatp-qpz/xatp-qpz.vmx"
    --> }
    --> ],
    --> message = "Failed to establish transport connection (9): There is no VMware process running for config file /vmfs/volumes/4d0a870d-bb6a7fdb-30f4-842b2b0018bf/xatp-qpz/xatp-qpz.vmx."
    --> },
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.asyncsocket.remotedisconnect",
    --> arg = <unset>,
    --> message = "Remote disconnected"
    --> }
    --> ],
    --> reason = "Disconnected from virtual machine."
    --> msg = "Disconnected from virtual machine."
    --> }
    --> Args:
    -->

     
  • In the /var/log/hostd.log file, there are entries similar to:

    2017-02-01T14:02:40.287Z error hostd[1DB40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/5887e349-bfb4f083-bba9-f4e9d4a573f0/xatp-qpz/xatp-qpz.vmx opID=MigrationWizard-applyOnMultiEntity-128876-ngc-3e-01-37-3841 user=vpxuser:vpxuser] Expected permission (3) for /vmfs/volumes/5755dfab-e1219172-8117-90b11c2fe741/xatp-qpz-72a2e578.vswp.41456 not found in domain 10
    2017-02-01T14:02:40.287Z info hostd[1DB40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/5887e349-bfb4f083-bba9-f4e9d4a573f0/xatp-qpz/xatp-qpz.vmx opID=MigrationWizard-applyOnMultiEntity-128876-ngc-3e-01-37-3841 user=vpxuser:vpxuser] VM is in state VM_STATE_IMMIGRATING



    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
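
The hostd.log signature above (a missing permission for a file, logged under the same opID as a VM_STATE_IMMIGRATING message) can be searched for with a short script. The following is a minimal sketch, not a VMware tool: the function name and regular expressions are assumptions derived only from the example log lines in this article, and the message format may differ in your environment.

```python
import re

# Patterns derived from the example hostd.log lines in this article (assumed format).
PERMISSION_RE = re.compile(
    r"Expected permission \((?P<perm>\d+)\) for (?P<path>\S+) "
    r"not found in domain (?P<domain>\d+)"
)
OPID_RE = re.compile(r"opID=(?P<opid>\S+)")
IMMIGRATING = "VM is in state VM_STATE_IMMIGRATING"


def find_sandbox_false_positives(lines):
    """Return (opID, path) pairs for permission errors whose opID also
    appears on a VM_STATE_IMMIGRATING line, in any order."""
    denied = {}          # opID -> paths that produced a permission error
    immigrating = set()  # opIDs seen with the IMMIGRATING state message
    for line in lines:
        opid_match = OPID_RE.search(line)
        if opid_match is None:
            continue
        opid = opid_match.group("opid")
        perm_match = PERMISSION_RE.search(line)
        if perm_match is not None:
            denied.setdefault(opid, []).append(perm_match.group("path"))
        elif IMMIGRATING in line:
            immigrating.add(opid)
    return [
        (opid, path)
        for opid, paths in denied.items()
        if opid in immigrating
        for path in paths
    ]
```

To use it, pass any iterable of log lines, such as an open file handle for a copy of /var/log/hostd.log. A non-empty result suggests, but does not prove, that the host is hitting this issue.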


Environment

VMware vCenter Server Appliance 6.5.x
VMware vCenter Server 6.5.x
VMware vSphere ESXi 6.5

Cause

The security domain policy of the destination ESXi host contains the correct permissions for the virtual machine after the migration is complete, but not during the migration, when certain paths (for example, the swap file) may still be in flux and the config file is not yet finalized. As a result, hostd reports a false positive when it verifies the content of the config file and terminates the Storage vMotion.
 
 

Resolution

This is a known issue affecting ESXi 6.5.
This issue is resolved in ESXi 6.5 U1, available at VMware Downloads.
To work around this issue without upgrading, add the following option to the /etc/vmware/hostd/config.xml file to disable the check that triggers the false positive:
 
<config>
  <plugins>
    <vmsvc>
      <enforceVmxSandbox> false </enforceVmxSandbox>
    </vmsvc>
  </plugins>
</config>
 
 
Note: Restart the hostd process for the option to take effect, for example by running /etc/init.d/hostd restart on the ESXi host.
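
Because config.xml on a real host already contains other settings, the option has to be merged into the existing <config>/<plugins>/<vmsvc> hierarchy rather than pasted in as a second <config> block. The following is a minimal sketch of that merge using Python's standard library. The function name is hypothetical, and the default parser used here discards XML comments, so treat it as an illustration and back up the original file before changing it.

```python
import xml.etree.ElementTree as ET


def disable_vmx_sandbox(config_xml: str) -> str:
    """Return config_xml with <enforceVmxSandbox> set to false under
    <config>/<plugins>/<vmsvc>, creating any missing elements."""
    root = ET.fromstring(config_xml)
    if root.tag != "config":
        raise ValueError("expected <config> as the root element")
    node = root
    # Walk down the hierarchy, reusing existing elements where present.
    for tag in ("plugins", "vmsvc", "enforceVmxSandbox"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    # Same value and spacing as the fragment shown in this article.
    node.text = " false "
    return ET.tostring(root, encoding="unicode")
```

The function works on a string so that the result can be reviewed before it is written back to /etc/vmware/hostd/config.xml. Running it twice leaves a single <enforceVmxSandbox> element in place.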


Additional Information

Enabling Fault Tolerance while a virtual machine is powered on fails with the error: Replay is unavailable for the current configuration
Snapshot commit task while vSAN 5.5 owner abdication is in progress results in virtual machine failure
Storage vMotion of a virtual machine fails after 5 minutes (Japanese version of this article)