Evacuating a large VM configuration off an ESXi host is likely to hit VUM/vLCM timeouts

Article ID: 321022

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
Upgrading hosts with large VM configurations (greater than 256 vCPUs and 6 TB of VM memory) using VUM/vLCM may time out if the host cannot enter maintenance mode within the default time limit set by VUM/vLCM. As a result, the VUM/vLCM remediation operation fails.

Here is a test scenario for upgrading a host with a large VM configuration (480 vCPUs and 10 TB of VM memory) in a two-host cluster (host1 and host2):
  1. A SLES15SP0 guest OS VM configured with 480 vCPUs and 10 TB of memory was powered on on host1.
  2. A memory/CPU stress workload was run to consume 80% of the resources.
  3. The upgrade was started using VUM remediation. Host1 started to enter maintenance mode. The maintenance mode task reported “Operation timed out” after 30 minutes.
  4. The remediation process failed with “Cannot enter maintenance mode” after 30 minutes.
  5. Host1 entered maintenance mode only after its VM had migrated to host2. The VM migration took more than 2 hours to complete.
  6. Retrying the remediation after the migration succeeded and completed the upgrade.
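
The gap between the 30-minute limit and the observed migration time in the scenario above can be put into rough numbers. The following is back-of-the-envelope arithmetic only, not a VUM/vLCM calculation. It assumes the VM's configured memory is copied exactly once and that "10 TB" means 10 TiB; pages re-dirtied by the stress workload are ignored, so the real requirement would be higher. The function name is hypothetical.

```python
def required_throughput_gib_s(memory_tib: float, window_minutes: float) -> float:
    """Average copy rate in GiB/s needed to move memory_tib within window_minutes.

    Assumption: a single pass over the configured memory, no re-copied pages.
    """
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return memory_tib * 1024 / (window_minutes * 60)


# Figures from the test scenario: 10 TB of VM memory, a 30-minute limit,
# and a migration that took more than 2 hours (120 minutes).
needed = required_throughput_gib_s(10, 30)
effective_ceiling = required_throughput_gib_s(10, 120)

print(f"Rate needed to finish within 30 minutes: {needed:.2f} GiB/s")
print(f"Effective rate when taking 2 hours:      {effective_ceiling:.2f} GiB/s")
```

Under these assumptions, finishing inside the 30-minute window would need roughly four times the effective rate of a migration that takes 2 hours, which is consistent with the timeout seen in steps 3 and 4.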


Environment

VMware vSphere ESXi 7.0.0

Resolution

This is a known issue affecting VMware ESXi 7.0 Update 1.

Currently, there is no resolution.

Workaround:
To work around this issue, power off the VMs or migrate them manually before starting the host upgrade.
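
As an illustration of the workaround, the sketch below flags the VMs on a host that should be powered off or migrated manually before remediation starts. It is a minimal example, not a VMware tool: the `VmConfig` class, the function name, and the sample inventory are hypothetical, and in practice the VM list would come from your own vCenter inventory export. The thresholds are the ones stated in the Symptoms section, applied together as worded there.

```python
from dataclasses import dataclass

# Thresholds from the Symptoms section of this article.
LARGE_VCPUS = 256
LARGE_MEMORY_TB = 6


@dataclass
class VmConfig:
    """Hypothetical record describing one VM on the host to be upgraded."""
    name: str
    vcpus: int
    memory_tb: float
    powered_on: bool = True


def vms_needing_manual_action(vms):
    """Return powered-on VMs that exceed both the vCPU and memory thresholds."""
    return [
        vm for vm in vms
        if vm.powered_on
        and vm.vcpus > LARGE_VCPUS
        and vm.memory_tb > LARGE_MEMORY_TB
    ]


# Sample inventory (made-up names; the first entry mirrors the test scenario).
inventory = [
    VmConfig("sles15-large", vcpus=480, memory_tb=10),
    VmConfig("small-app", vcpus=8, memory_tb=0.0625),
    VmConfig("large-but-off", vcpus=480, memory_tb=10, powered_on=False),
]

flagged = vms_needing_manual_action(inventory)
for vm in flagged:
    print(f"{vm.name}: power off or migrate manually before remediation")
```

A stricter check could flag a VM that exceeds either threshold alone; that is a policy choice this article does not specify.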