vSphere Lifecycle Manager is unable to place a host into maintenance mode while a vGPU Virtual Machine is running on that host.
“PRE-CHECK” returns the following message:
Virtual machine ... that runs on host '...' reported an issue which prevents entering maintenance mode...
“REMEDIATE ALL” returns the following error:
Entering maintenance mode failed for host ...
For vSphere Lifecycle Manager with vCenter 8.0 and newer, a VI Admin can enable DRS automation using the DRS Cluster Advanced Option described in vGPU Virtual Machine automated migration for Host Maintenance Mode in a DRS Cluster.
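For illustration only, the following pyVmomi (Python) sketch shows one way a DRS Cluster Advanced Option can be added programmatically. The vCenter address, credentials, cluster name, and especially the option key and value are placeholders; substitute the exact option documented in the article referenced above.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_cluster(content, name):
    # Walk the inventory for a ClusterComputeResource with the given name.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    try:
        return next(c for c in view.view if c.name == name)
    finally:
        view.DestroyView()

ctx = ssl._create_unverified_context()  # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
try:
    cluster = find_cluster(si.RetrieveContent(), "GPU-Cluster")
    # Placeholder key/value: replace with the DRS Cluster Advanced Option named
    # in the referenced article.
    drs_config = vim.cluster.DrsConfigInfo(
        option=[vim.option.OptionValue(key="ExampleVgpuAutomationOption", value="1")])
    spec = vim.cluster.ConfigSpecEx(drsConfig=drs_config)
    cluster.ReconfigureComputeResource_Task(spec, modify=True)
finally:
    Disconnect(si)

The equivalent change can also be made interactively in the vSphere Client through the cluster's vSphere DRS Advanced Options editor, which is the workflow described in the referenced article.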
Remarks:
There must be spare vGPU host capacity in the Cluster for running vGPU Virtual Machines to migrate to during upgrade.
It is recommended to place vGPU Virtual Machines on shared storage to expedite migration.
Refer to the vendor's compatibility matrix for host driver upgrades.
Upgrading hosts with running vGPU Virtual Machines using vSphere Update Manager, or vSphere Lifecycle Manager with vSphere 7.x and older, is NOT supported. A VI Admin can use the following Migration options to allow for Host Maintenance Mode (a sketch of these operations follows the list):
vMotion the vGPU Virtual Machines
Suspend the vGPU Virtual Machines
Power Off the vGPU Virtual Machines
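A minimal pyVmomi (Python) sketch of these three options, assuming an existing service-instance connection si, a vGPU Virtual Machine named "vgpu-vm-01", and a destination host "esx02.example.com" with spare vGPU capacity (all names are hypothetical):

from pyVmomi import vim

def find_obj(content, vimtype, name):
    # Locate an inventory object of the given type by name.
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.DestroyView()

def evacuate_vgpu_vm(vm, mode, target_host=None):
    # Apply exactly one of the three options to move the running vGPU VM off the host.
    if mode == "vmotion":
        return vm.MigrateVM_Task(host=target_host,
                                 priority=vim.VirtualMachine.MovePriority.defaultPriority)
    if mode == "suspend":
        return vm.SuspendVM_Task()
    if mode == "poweroff":
        return vm.PowerOffVM_Task()  # or vm.ShutdownGuest() for a guest-initiated shutdown
    raise ValueError("mode must be 'vmotion', 'suspend', or 'poweroff'")

content = si.RetrieveContent()
vm = find_obj(content, vim.VirtualMachine, "vgpu-vm-01")
host = find_obj(content, vim.HostSystem, "esx02.example.com")
task = evacuate_vgpu_vm(vm, "vmotion", target_host=host)

Each call returns a vCenter task; wait for the task to complete before placing the host into Maintenance Mode.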
For Virtual Machines whose expected stun time exceeds the vMotion timeout, the following options allow for Host Maintenance Mode:
Suspend the vGPU Virtual Machines
Power Off the vGPU Virtual Machines
Adjust the vMotion timeout to account for the Virtual Machine's Expected Worst-Case Stun Time (see the references and the sketch below).
vMotion Timeout: vMotion or Storage vMotion of a VM fails with the error: The migration has exceeded the maximum switchover time of 100 second(s).
Estimated Worst-Case Stun Times: Virtual Machine Conditions and Limitations for vSphere vMotion
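For the timeout adjustment, the per-VM option commonly associated with the 100-second switchover error quoted above is vmotion.maxSwitchoverSeconds; confirm the exact setting and a suitable value (at least the Expected Worst-Case Stun Time) against the referenced articles. A minimal pyVmomi (Python) sketch, assuming an existing connection and a vm object obtained as in the earlier sketch:

from pyVmomi import vim

def set_switchover_timeout(vm, seconds):
    # Raise the per-VM vMotion switchover timeout via the VM's extra config.
    spec = vim.vm.ConfigSpec(extraConfig=[
        vim.option.OptionValue(key="vmotion.maxSwitchoverSeconds", value=str(seconds))])
    return vm.ReconfigVM_Task(spec)

# Example: allow a 300-second switchover for a VM with a long expected stun time.
task = set_switchover_timeout(vm, 300)

Depending on the vSphere version, the new value may only take effect after the Virtual Machine is power-cycled; verify the behavior against the vMotion Timeout article above.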