When you attempt to migrate a VM that uses NVIDIA vGPU, the migration wizard fails with compatibility errors such as:
"Currently connected device 'PCI device 0' uses backing 'nvidia_l40-4q', which is not accessible."
"A warning or error occurred when migrating the virtual machine. Virtual machine relocation, or power on after relocation or cloning can fail if vGPU resources are not available on the destination host."
In the vpxd log (/var/log/vmware/vpxd/vpxd.log on the vCenter Server Appliance), you will see an error similar to:
YYYY-MM-DDTHH:MM:SS info vpxd[07827] [Originator@6876 sub=VmCheck] CompatCheck results: (vim.vm.check.Result) [
--> (vim.vm.check.Result) {
--> vm = 'vim.VirtualMachine:C8309A8A-722D-48F7-BA8C-A024A377C3B1:vm-354419',
--> host = 'vim.HostSystem:C8309A8A-722D-48F7-BA8C-A024A377C3B1:host-477101',
--> error = (vmodl.MethodFault) [
--> (vim.fault.InsufficientResourcesFault) {
--> faultMessage = (vmodl.LocalizableMessage) [
--> (vmodl.LocalizableMessage) {
--> key = "com.vmware.vim.vpxd.vmcheck.assignHwNotAvailable",
--> arg = (vmodl.KeyAnyValue) [
--> (vmodl.KeyAnyValue) {
--> key = "host",
--> value = "<ESXi name>"
--> },
--> (vmodl.KeyAnyValue) {
--> key = "vm",
--> value = "<VM-Name>"
--> },
--> (vmodl.KeyAnyValue) {
--> key = "missing",
--> value = "pciPassthru0"
--> }
--> ],
--> }
--> ],
--> msg = ""
--> }
--> ],
--> }
--> ]
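To locate this entry, you can search the vpxd log for the fault message key shown above:
$ grep -i assignHwNotAvailable /var/log/vmware/vpxd/vpxd.log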
No problems are found in the GPU hardware itself, and the output of the nvidia-smi command looks normal.
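For reference, the health check referred to here needs nothing beyond standard nvidia-smi invocations on the affected ESXi host:
$ nvidia-smi        # per-GPU summary: state, utilization, and VMs using vGPU
$ nvidia-smi -q     # full per-GPU query, including ECC status
In this issue, both commands return clean output even while the migration compatibility check fails.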
We are currently investigating the cause of this issue.
As a workaround, restart the NVIDIA GPU Management Daemon (nvdGpuMgmtDaemon) on the affected ESXi host:
$ /etc/init.d/nvdGpuMgmtDaemon stop
$ /etc/init.d/nvdGpuMgmtDaemon start
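After the restart, before retrying the migration, you can confirm that the daemon came back up and that the host still enumerates its GPUs. This sketch assumes the init script supports a status verb, as most ESXi init scripts do:
$ /etc/init.d/nvdGpuMgmtDaemon status   # assumed supported; verify on your ESXi build
$ nvidia-smi                            # host should still list all GPUs
Once both look healthy, retry the vMotion of the affected VM.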