Using Storage vMotion to migrate a virtual machine with many disks fails with timeout

Article ID: 307365

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Storage vMotion for a virtual machine fails with an "Operation timed out" error at between 5% and 10% progress.
  • /var/run/log/hostd.log on the ESXi host contains errors similar to the following one:
    [7196 foundryVM.c:10177]: Error VIX_E_INVALID_ARG in VixVM_CancelOps(): One of the parameters was invalid 'vm:/vmfs/volumes/<datastore>/<vm_folder>/<vm_name>.vmx' opID=9BED9F06-000002BE-9d] Failed to unset VM medatadata: FileIO error: Could not find file : /vmfs/volumes/<datastore>/<folder>/<vm_name>.vmx-aux.xml.tmp
  • In addition, /var/run/log/vmkernel.log contains warnings like the following:
    vmkernel: 114:03:25:51.489 cpu0:4100)WARNING: FSR: 690: xxx S: Maximum switchover time (100 seconds) reached. Failing migration; VM should resume on source.
    vmkernel: 114:03:25:51.489 cpu2:10561)WARNING: FSR: 3281: xxx D: The migration exceeded the maximum switchover time of 100 second(s). ESX has preemptively failed the migration to allow the VM to continue running on the source host.
    vmkernel: 114:03:25:51.489 cpu2:10561)WARNING: Migrate: 296: xxx D: Failed: Maximum switchover time for migration exceeded(0xbad0109) @0x41800f61cee2
  • On the vCenter Server, /var/log/vmware/vpxd/vpxd.log has entries such as these:
    [yyyy-mm-dd hh:mm:ss.nnn tttt error 'App'] [MIGRATE] (migrateidentifier) vMotion failed: vmodl.fault.SystemError
    [yyyy-mm-dd hh:mm:ss.nnn tttt verbose 'App'] [VpxVmomi] Throw vmodl.fault.SystemError with:
    (vmodl.fault.SystemError) {
    dynamicType = <unset>,
    reason = "Source detected that destination failed to resume.",
    msg = "A general system error occurred: Source detected that destination failed to resume.
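
To check quickly whether a host shows the vmkernel symptom, the log can be searched for the switchover message. The following is a minimal sketch, assuming a POSIX shell on the ESXi host. The function name find_switchover_timeouts is hypothetical; the default log path is the one listed above.

```shell
# Hypothetical helper: print vmkernel log lines that report a switchover timeout.
# The default path is the one named in this article; pass another path to override it.
find_switchover_timeouts() {
    log_file="${1:-/var/run/log/vmkernel.log}"
    # Case-insensitive, because the sample messages use both
    # "Maximum switchover time" and "maximum switchover time".
    grep -i "maximum switchover time" "$log_file"
}

# Usage (on the ESXi host):
#   find_switchover_timeouts
#   find_switchover_timeouts /path/to/copied/vmkernel.log
```

If the function prints nothing, the log contains no switchover timeout messages and the failure may have a different cause.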

Environment

  • VMware vSphere ESXi 7.0.x
  • VMware vSphere ESXi 8.0.x

Cause

Every Storage vMotion operation needs time to open, close, and process all of the virtual machine's disks.

If the VM has a large number of disks, the time required for these operations can exceed the default timeout of 100 seconds, especially if other operations, such as provisioning, migration, or power operations, are running on the datastore.

Resolution

To work around this issue, the preconfigured timeout can be extended beyond the default value of 100 seconds.

This can be done by adding or updating the VM advanced option fsr.maxSwitchoverSeconds, either in the vSphere Client or ESXi Host Client UI, or by manually editing the .vmx configuration file in the virtual machine's folder on the datastore.

To update the option, make sure that the virtual machine is powered off (shut down the guest operating system first if necessary), then choose either of the following methods:

Process to modify fsr.maxSwitchoverSeconds using the vSphere Client or ESXi Host Client:

Follow the steps below to modify the fsr.maxSwitchoverSeconds option using the vSphere Client or ESXi Host Client:

  1. Log in to the vSphere Client or ESXi Host Client.
  2. Locate the virtual machine in the Inventory.
  3. Power off the virtual machine if necessary.
  4. Right-click the virtual machine and select Edit Settings.
  5. Click the Advanced Parameters tab.
  6. Under "Attribute", enter fsr.maxSwitchoverSeconds, and under "Value", enter a value higher than 100 (for example, 200).
  7. Click "Add" to add the new option to the VM configuration.
  8. Click OK to confirm the change, then power the VM back on.

Process to modify fsr.maxSwitchoverSeconds by editing the .vmx file manually:

To modify the fsr.maxSwitchoverSeconds option by editing the .vmx file manually, follow the steps below:

  1. Connect via SSH to the ESXi host where the VM is registered.
  2. Change into the VM folder (/vmfs/volumes/<datastore>/<vm_name>/).
  3. Edit the .vmx file:
    # vi <vm_name>.vmx
  4. Verify that the option does not exist yet in the file, then add a new line at the end:
    fsr.maxSwitchoverSeconds = "200"
  5. Save the file and exit the vi editor (:wq), then power on the VM.
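
Steps 3 and 4 can also be scripted instead of done in vi. The following is a minimal sketch, assuming a POSIX shell on the ESXi host and a powered-off VM. The function name set_max_switchover and the .bak backup copy are illustrative additions, not part of the documented procedure.

```shell
# Hypothetical helper: add fsr.maxSwitchoverSeconds to a .vmx file if it is absent.
# Run only while the VM is powered off.
# Arguments: path to the .vmx file, timeout in seconds (defaults to 200).
set_max_switchover() {
    vmx_file="$1"
    seconds="${2:-200}"

    # Step 4: verify that the option does not exist yet.
    if grep -qi '^[[:space:]]*fsr\.maxSwitchoverSeconds' "$vmx_file"; then
        echo "Option already present; edit the existing line instead:" >&2
        grep -i '^[[:space:]]*fsr\.maxSwitchoverSeconds' "$vmx_file" >&2
        return 1
    fi

    # Keep a copy of the original file before changing it (an added precaution).
    cp "$vmx_file" "$vmx_file.bak" || return 1

    # If the file does not end with a newline, add one so the new
    # option starts on its own line.
    if [ -n "$(tail -c 1 "$vmx_file")" ]; then
        echo >> "$vmx_file"
    fi

    # Append the option in the key = "value" form shown in step 4.
    printf 'fsr.maxSwitchoverSeconds = "%s"\n' "$seconds" >> "$vmx_file"
}

# Usage (placeholders as in the steps above):
#   set_max_switchover "/vmfs/volumes/<datastore>/<vm_name>/<vm_name>.vmx" 200
```

The function refuses to add a second entry because a duplicate key in the .vmx file is ambiguous; if the option is already there, change the existing value by hand.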

For more information, see Tips for editing a .vmx file.


Note: To edit a virtual machine's configuration file, power off the virtual machine, remove it from the Inventory, make the changes to the .vmx file, add the virtual machine back to the Inventory, and then power it on again. Alternatively, follow the article Reloading a vmx file without removing the virtual machine from inventory.
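
The reload alternative in the note above is usually done with vim-cmd on the ESXi host: vim-cmd vmsvc/getallvms lists the registered VMs with their IDs, and vim-cmd vmsvc/reload <vmid> reloads the configuration of one VM. The following is a minimal sketch under that assumption; check the referenced article for the authoritative steps. The function name reload_vmx and the VIM_CMD override variable are hypothetical and exist only so the call can be dry-run.

```shell
# Hypothetical helper: reload a VM's .vmx file by its VM ID.
# Find the ID first with:  vim-cmd vmsvc/getallvms
# VIM_CMD is an illustrative override (for example VIM_CMD=echo for a dry run).
reload_vmx() {
    vmid="$1"
    if [ -z "$vmid" ]; then
        echo "usage: reload_vmx <vmid>" >&2
        return 2
    fi
    "${VIM_CMD:-vim-cmd}" vmsvc/reload "$vmid"
}

# Usage (on the ESXi host, after editing the .vmx file):
#   reload_vmx <vmid>
```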

Additional Information

During Storage vMotion, this issue can occur if the host cannot finish copying the virtual machine's swap file from the source datastore to the destination datastore within the default time of 100 seconds. It can also occur if the source or destination datastore has performance issues caused by heavy I/O activity.

For more information, see vMotion or Storage vMotion of a VM fails with the error: The migration has exceeded the maximum switchover time of 100 second(s).

See also Location of vCenter Server log files.