Storage vMotion of a virtual machine fails after five minutes

Article ID: 318891

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Storage vMotion of a virtual machine fails after five minutes with the error vim.fault.GenericVmConfigFault. The failure occurs when the task progress is between 30% and 44%.
  • In the /var/run/log/vpxa.log file, there are entries similar to:
    YYYY-MM-DDTHH:MM:SS.825Z info vpxa[BAD5B70] [Originator@6876 sub=Default opID=MigrationWizard-applyOnMultiEntity-######-###-##-##-##] [VpxLRO] -- ERROR task-153 -- vmotionManager -- vim.host.VMotionManager.initiateSourceEx: vim.fault.GenericVmConfigFault:
    --> Result:
    --> (vim.fault.GenericVmConfigFault) {
    --> faultCause = (vmodl.MethodFault) null,
    --> faultMessage = (vmodl.LocalizableMessage) [
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.migrate.fail.dst",
    --> arg = <unset>,
    --> message = "The source detected that the destination failed to resume."
    --> }
    --> ],
    --> reason = "The source detected that the destination failed to resume."
    --> msg = "The source detected that the destination failed to resume."
    --> }
    --> Args:
    -->
    --> Arg migrationId:
    --> ###################
    --> Arg dstId:
    --> ######
    [...]
    YYYY-MM-DDTHH:MM:SS.829Z info vpxa[B96AB70] [Originator@6876 sub=Default opID=MigrationWizard-applyOnMultiEntity-######-###-##-##-##-##] [VpxLRO] -- ERROR task-152 -- -- vim.host.VMotionManager.initiateDestination:tracking: vim.fault.GenericVmConfigFault:
    --> Result:
    --> (vim.fault.GenericVmConfigFault) {
    --> faultCause = (vmodl.MethodFault) null,
    --> faultMessage = (vmodl.LocalizableMessage) [
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.vigor.transport.connection.error",
    --> arg = <unset>,
    --> message = "Disconnected from virtual machine."
    --> },
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.vigor.transport.connection.fail",
    --> arg = (vmodl.KeyAnyValue) [
    --> (vmodl.KeyAnyValue) {
    --> key = "1",
    --> value = "9"
    --> },
    --> (vmodl.KeyAnyValue) {
    --> key = "2",
    --> value = "There is no VMware process running for config file /vmfs/volumes/########-####-############/vm/vm.vmx"
    --> }
    --> ],
    --> message = "Failed to establish transport connection (9): There is no VMware process running for config file /vmfs/volumes/########-####-############/vm/vm.vmx."
    --> },
    --> (vmodl.LocalizableMessage) {
    --> key = "msg.asyncsocket.remotedisconnect",
    --> arg = <unset>,
    --> message = "Remote disconnected"
    --> }
    --> ],
    --> reason = "Disconnected from virtual machine."
    --> msg = "Disconnected from virtual machine."
    --> }
    --> Args:
    -->
  • In the /var/run/log/hostd.log file, there are entries similar to:
    YYYY-MM-DDTHH:MM:SS.287Z error hostd[1DB40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/########-####-############/vm/vm.vmx opID=MigrationWizard-applyOnMultiEntity-######-###-##-##-##-#### user=vpxuser:vpxuser] Expected permission (3) for /vmfs/volumes/########-####-############/vm-########.vswp.##### not found in domain 10
    YYYY-MM-DDTHH:MM:SS.287Z info hostd[1DB40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/########-####-############/vm/vm.vmx opID=MigrationWizard-applyOnMultiEntity-######-###-##-##-##-#### user=vpxuser:vpxuser] VM is in state VM_STATE_IMMIGRATING
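
To check whether a host shows this signature, the hostd.log entries above can be searched for the "Expected permission ... not found in domain" message. The following is a minimal Python sketch, not a VMware-provided tool: the regular expression is an assumption derived from the sample line above, and the function name find_permission_errors is hypothetical.

```python
import re

# Pattern derived from the sample hostd.log line in this article.
# This is an assumption; the exact wording may differ between builds.
PERMISSION_ERROR = re.compile(
    r"Expected permission \((?P<perm>\d+)\) for (?P<path>\S+) "
    r"not found in domain (?P<domain>\d+)"
)


def find_permission_errors(lines):
    """Return a (permission, path, domain) tuple for each matching log line."""
    hits = []
    for line in lines:
        match = PERMISSION_ERROR.search(line)
        if match:
            hits.append(
                (
                    int(match.group("perm")),
                    match.group("path"),
                    int(match.group("domain")),
                )
            )
    return hits
```

For example, passing an open file object for a copy of /var/run/log/hostd.log to find_permission_errors returns one tuple per matching entry. A match on a .vswp path during a failed Storage vMotion is consistent with the symptoms described here, but does not by itself confirm the cause.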
    

Environment

VMware vCenter Server 6.5.x
VMware vSphere ESXi 6.5

Cause

  • The security domain policy of the destination ESXi host contains the correct permissions for the virtual machine after the migration is complete, but not during the migration.
  • Certain paths (such as the swap file) may be in a transient state while the configuration file is not yet finalized. As a result, hostd reports a false positive when it verifies the contents of the configuration file and terminates the Storage vMotion task.

Resolution

  • This is a known issue affecting ESXi 6.5.

  • This issue is resolved in ESXi 6.5 U1, which is available from Broadcom Downloads.

  • To work around this issue without performing an upgrade, add this option to the /etc/vmware/hostd/config.xml file to disable the logic that triggers the false positive.
    <config>
      <plugins>
        <vmsvc>
          <enforceVmxSandbox> false </enforceVmxSandbox>
        </vmsvc>
      </plugins>
    </config>
Note: Restart the hostd process for the option to take effect.
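
The workaround can also be applied programmatically. The following is a minimal Python sketch, assuming the file's root element is <config> as shown above; the function name disable_vmx_sandbox is hypothetical and this is not an official VMware procedure. It works on the XML text rather than on the file, so the result can be reviewed before anything is written. Note that the standard-library parser used here discards XML comments, so back up the original config.xml and compare the output against it before replacing the file.

```python
import xml.etree.ElementTree as ET


def disable_vmx_sandbox(config_xml):
    """Return config_xml with config/plugins/vmsvc/enforceVmxSandbox set to false.

    Missing intermediate elements are created; an existing
    enforceVmxSandbox element is reused rather than duplicated.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "config":
        raise ValueError("expected <config> as the root element")

    node = root
    for tag in ("plugins", "vmsvc", "enforceVmxSandbox"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child

    # Same value and spacing as the fragment shown in this article.
    node.text = " false "
    return ET.tostring(root, encoding="unicode")
```

The function is idempotent: running it on its own output leaves a single enforceVmxSandbox element in place.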