vMotion fails with the error: Migration to host failed with error timeout (0xbad0020)

Article ID: 335114

Products

VMware vSphere ESXi, VMware vCenter Server

Issue/Introduction

  • The vMotion task fails with the message:

The vMotion migrations failed because the ESX hosts were not able to connect over the vMotion network. Check the vMotion network settings and physical network configuration.

  • The vpxd log (/var/log/vmware/vpxd/vpxd.log) contains the errors:

Migration to host failed with error timeout (0xbad0020)
[MIGRATE] (2754482182) error while tracking VMotion progress (Timedout)

Environment

  • ESXi 8.x
  • ESX 9.x
  • vCenter 8.x
  • vCenter 9.x

Cause

  • Incorrect vMotion network configuration on the host can lead to a loss of network connectivity between hosts, preventing communication over the vMotion network.

Resolution

Investigate the following ESX vSwitch VLAN settings:

  1. Ensure the correct VLAN tagging mode is in use: EST (External Switch Tagging) or VST (Virtual Switch Tagging).
  2. If the physical switch port is configured as an access port (EST), ensure that the VLAN ID on the vMotion port group is set to 0.
  3. If the physical switch port is configured as a trunk (VST), ensure that the same, correct VLAN ID is set on the vMotion port group on every host in the cluster.
  4. Use the vSwitch CDP information to verify the physical switch port configuration.
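The VLAN checks above can also be made from the ESXi Shell on a standard switch. The following is a minimal sketch, not an official procedure: `portgroup_vlan` is a hypothetical helper name, and it assumes that `esxcli network vswitch standard portgroup list` prints columns separated by two or more spaces with the VLAN ID in the last column.

```shell
# Hypothetical helper: print the VLAN ID of one standard port group.
# Reads `esxcli network vswitch standard portgroup list` output on stdin.
# Assumption: columns are separated by 2+ spaces and VLAN ID is the last column.
portgroup_vlan() {
  awk -v pg="$1" -F'  +' '
    { sub(/ +$/, "") }        # drop trailing padding so $NF is the VLAN ID
    $1 == pg { print $NF }    # match the port group name exactly
  '
}
```

Example use on a host: `esxcli network vswitch standard portgroup list | portgroup_vlan vMotion`. Running it on each host in the cluster and comparing the values is one way to spot a mismatched VLAN ID.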

For more information on vMotion failures, see Understanding and Troubleshooting vMotion (321009).

If the issue is not resolved even after performing the previous steps, remove the vMotion port group and recreate it.

1. Remove VMkernel Adapters: A port group cannot be deleted while a VMkernel interface is assigned to it. Note the IP addresses configured on the vMotion port group before removing anything.

  1. Navigate to the Host in the vSphere Client.
  2. Go to Configure > Networking > VMkernel adapters.
  3. Identify the adapter used for vMotion (example: vmk1, vmk2).
  4. Select the adapter and click Remove.

Warning: Ensure there is another path for vMotion if the host is still in production, or migrate the service to a different adapter first.
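To record the current vMotion address before removing the adapter, the IPv4 settings can be read from the ESXi Shell. This is a sketch under assumptions: `vmk_ipv4` is a hypothetical helper, and it assumes the first three columns of `esxcli network ip interface ipv4 get` are the interface name, IPv4 address, and netmask.

```shell
# Hypothetical helper: print "address/netmask" for one VMkernel interface.
# Reads `esxcli network ip interface ipv4 get` output on stdin.
# Assumption: column 1 = Name, column 2 = IPv4 Address, column 3 = IPv4 Netmask.
vmk_ipv4() {
  awk -v vmk="$1" '$1 == vmk { print $2 "/" $3 }'
}
```

Example use on a host: `esxcli network ip interface ipv4 get | vmk_ipv4 vmk1`. The interface name vmk1 is only an example; use the adapter identified in the steps above.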

2. Remove the Port Group: Once the VMkernel adapter is removed, follow the steps for the specific switch type being used:

A. For vSphere Distributed Switch (vDS):

    1. Go to the Networking inventory view (Globe icon).
    2. Locate the Distributed Switch and expand it.
    3. Right-click the vMotion Distributed Port Group.
    4. Select Delete.

Note: If a "Resource in use" error appears while deleting the port group, check the Ports tab for hidden ports or templates that are still associated with it.

B. For vSphere Standard Switch:

    1. Navigate to the Host > Configure > Networking > Virtual switches.
    2. Locate the standard switch (e.g., vSwitch0).
    3. Find the vMotion port group in the list.
    4. Click the three dots (...) or the X icon next to the port group name and select Remove.

3. Create the Port Group

A. For vSphere Distributed Switch (vDS):

    1. Navigate to Networking in the vSphere Client.
    2. Right-click on the Distributed Switch and select Distributed Port Group > New Distributed Port Group.
    3. Name and Location: Provide a port group name (e.g., vMotion).
    4. Configure Settings: Set the VLAN ID if the vMotion traffic is on a tagged VLAN.
    5. Click Finish.

B. For vSphere Standard Switch:

    1. Navigate to the Host > Configure > Networking > Virtual switches.
    2. Click Add Networking.
    3. Select VMkernel Network Adapter and click Next.
    4. Select New standard switch (or an existing one).
    5. Set the Network label to vMotion and enter the VLAN ID.

4. Configure the VMkernel Adapter (Enabling vMotion): Assign an IP address and enable the "vMotion" service.

  1. Navigate to the Host > Configure > Networking > VMkernel adapters.
  2. Click Add Networking.
  3. Select VMkernel Network Adapter and click Next.
  4. Select Target Device: Choose the Select an existing network option and browse for the port group created in Step 3.
  5. Under Available services, check the box for vMotion.
  6. Ensure the TCP/IP stack is set to "Default" (or "vMotion" if a dedicated stack for routing is being used).
  7. IPv4 Settings: Select Use static IPv4 settings.
  8. Enter a dedicated IP address and Subnet Mask. Click Finish.

Note: The vMotion IP must be able to ping the vMotion IPs of all other hosts in the cluster.
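For a standard switch, steps 3 and 4 can also be carried out from the ESXi Shell. The sketch below only prints the commands so they can be reviewed before anything is changed. `print_vmotion_setup` is a hypothetical helper, every argument value is an example, and the sequence does not apply to a vDS.

```shell
# Hypothetical helper: print (do not run) esxcli commands that recreate a
# vMotion port group and VMkernel adapter on a standard switch.
# Arguments: vSwitch name, port group name, VLAN ID, vmk name, IPv4, netmask.
print_vmotion_setup() {
  vswitch="$1"; pg="$2"; vlan="$3"; vmk="$4"; ip="$5"; mask="$6"
  cat <<EOF
esxcli network vswitch standard portgroup add -p "$pg" -v "$vswitch"
esxcli network vswitch standard portgroup set -p "$pg" --vlan-id $vlan
esxcli network ip interface add -i $vmk -p "$pg"
esxcli network ip interface ipv4 set -i $vmk -I $ip -N $mask -t static
esxcli network ip interface tag add -i $vmk -t VMotion
EOF
}
```

Example: `print_vmotion_setup vSwitch1 vMotion 100 vmk1 10.0.0.11 255.255.255.0`. Printing rather than executing keeps the helper safe to run on a production host; the output can be pasted once the values have been checked against the addresses noted in step 1.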

5. Verify that the VMkernel adapter status is Enabled, then test connectivity from the ESXi Shell:

vmkping -I vmk# <Destination_Host_vMotion_IP>

example: vmkping -I vmk1 192.###.###.###
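Because every host must reach every other host's vMotion IP, the same test can be looped over a list of peers. This is a minimal sketch: `check_vmotion_peers` and the `VMKPING` override variable are hypothetical names, and the peer addresses are placeholders. If the adapter uses the dedicated vMotion TCP/IP stack, vmkping generally also needs the stack name (for example `-S vmotion`), which this sketch does not add.

```shell
# Hypothetical helper: ping each peer vMotion IP from one VMkernel interface.
# Usage: check_vmotion_peers <vmk> <peer_ip>...
# VMKPING may be set to another command for a dry run; default is vmkping.
check_vmotion_peers() {
  vmk="$1"; shift
  failed=0
  for ip in "$@"; do
    if "${VMKPING:-vmkping}" -I "$vmk" -c 3 "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      failed=1
    fi
  done
  return "$failed"   # non-zero if any peer was unreachable
}
```

Example: `check_vmotion_peers vmk1 <Host2_vMotion_IP> <Host3_vMotion_IP>`. A FAIL line points to the host pair whose VLAN or port group settings should be checked first.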

Additional Information

vMotion can also fail when the migrate module is disabled. In that case the message is: Failed to initialize migration at source. Error 0xbad0020. Not supported.

Verify that the migrate module is enabled by running the following command in the ESXi Shell: esxcli system module list | grep migrate

[root@<hostname>:~] esxcli system module list | grep migrate
Name                            Is Loaded  Is Enabled
------------------------------  ---------  ----------
migrate                            true        false

Enable the migrate module by running the command: esxcli system module set -e true -m migrate
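The check and the fix can be combined so that the module is only changed when it is actually disabled. This is a sketch, assuming the module list columns are Name, Is Loaded, Is Enabled as in the output above; `migrate_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the migrate module row shows
# "Is Enabled" = true. Reads `esxcli system module list` output on stdin.
# Assumption: column 1 = Name, column 2 = Is Loaded, column 3 = Is Enabled.
migrate_enabled() {
  awk '
    $1 == "migrate" { found = 1; ok = ($3 == "true") }
    END { exit !(found && ok) }   # exit 0 only when found and enabled
  '
}
```

Example use on a host: `esxcli system module list | migrate_enabled || esxcli system module set -e true -m migrate`.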

Additional reference for ESXi 7.x: Discontinued stickybit files on ESX