VM with vGPU and a large amount of memory becomes unresponsive during vMotion

Article ID: 312395

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
- A vMotion is performed on a VM with a supported vGPU and a large amount of memory (for example, 120 GB).
- The VM becomes unresponsive at approximately 70% of the migration for 45 seconds or more, depending on the active workload on the VM.
- Ping response times on the affected VM increase, but the VM still responds.
- The vMotion eventually completes successfully.
- A VM with a supported vGPU and a smaller amount of memory (for example, 8 GB) does not experience this problem.
- Migrating a VM with a large amount of memory (for example, 130 GB) and a vGPU can report the following error message:
"PCI Plugin reported error during migrate checkpoint write for device pciPassthru0"
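
To quantify the unresponsive period described in the symptoms, latency samples collected during a test migration can be analysed after the fact. The following is a minimal sketch, not a VMware tool: the function name, the sample format and the 100 ms threshold are assumptions made for illustration, and the samples would come from whatever probe you already use (for example, timestamped ping results).

```python
from typing import Iterable, Optional, Tuple

# One sample: (timestamp in seconds, round-trip time in ms, or None if the probe got no reply).
Sample = Tuple[float, Optional[float]]


def longest_degraded_window(samples: Iterable[Sample],
                            rtt_threshold_ms: float = 100.0) -> float:
    """Return the length in seconds of the longest run of consecutive
    samples that were lost or slower than rtt_threshold_ms.

    The threshold is an illustrative assumption; choose a value that
    matches the normal latency of your own environment.
    """
    longest = 0.0
    run_start = None  # timestamp of the first degraded sample in the current run
    for timestamp, rtt_ms in samples:
        degraded = rtt_ms is None or rtt_ms > rtt_threshold_ms
        if degraded:
            if run_start is None:
                run_start = timestamp
            longest = max(longest, timestamp - run_start)
        else:
            run_start = None  # a healthy sample ends the current run
    return longest


if __name__ == "__main__":
    # Synthetic data: one probe per second, slow replies from t=10 to t=55.
    demo = ([(float(t), 1.0) for t in range(10)]
            + [(float(t), 250.0) for t in range(10, 56)]
            + [(float(t), 1.0) for t in range(56, 60)])
    print(longest_degraded_window(demo))
```

Comparing the result for a large-memory vGPU VM against a small-memory one gives a rough measure of how much of the stun is related to memory size.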

Environment

VMware vCenter Server 7.0.3
VMware vCenter Server Appliance 6.7.x
VMware vCenter Server 6.7.x

Cause

VMware is aware of sub-optimal memory pinning performance in certain scenarios, such as those described above, and is currently working to improve this in a future release.

Resolution

Currently there is no resolution for this behaviour.

Workaround:
Currently there is no workaround that eliminates this behaviour. The following steps can reduce its impact.

If you have VMs with GPUs and a large amount of memory:
- Migrate the VM during times of low to no workload on the VM.
- If the VM is critical, the stun time would be unacceptable, and the VM is a member of a DRS-enabled cluster where the automation level is 'Fully Automated', consider setting a custom automation level of 'Partially Automated' for this VM so that DRS does not migrate it automatically.
- Reboot the VM to free the memory buffer.
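
The per-VM DRS automation level mentioned above can be set in the vSphere Client, or scripted. The following is a minimal sketch using the third-party pyVmomi SDK. It assumes you have already connected to vCenter Server and looked up the cluster and VM objects; the function name is hypothetical, and the script should be verified in a test environment before use.

```python
def set_partially_automated_override(cluster, vm):
    """Add or update a per-VM DRS override of 'partiallyAutomated'.

    cluster: a vim.ClusterComputeResource object (looked up by the caller).
    vm:      a vim.VirtualMachine object in that cluster.
    Returns the vCenter task object for the reconfiguration.
    """
    # Imported here so the module loads even where pyVmomi is not installed.
    from pyVmomi import vim  # third-party: pip install pyvmomi

    # If the VM already has an override, it must be edited rather than added.
    existing = cluster.configurationEx.drsVmConfig or []
    has_override = any(cfg.key._moId == vm._moId for cfg in existing)

    info = vim.cluster.DrsVmConfigInfo(
        key=vm,
        enabled=True,
        behavior="partiallyAutomated",
    )
    vm_spec = vim.cluster.DrsVmConfigSpec(
        operation="edit" if has_override else "add",
        info=info,
    )
    spec = vim.cluster.ConfigSpecEx(drsVmConfigSpec=[vm_spec])

    # modify=True applies only the supplied settings and leaves the rest unchanged.
    return cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```

With this override in place, DRS still generates migration recommendations for the VM but does not apply them automatically, so the migration can be scheduled for a period of low workload.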

Additional Information