When attempting to attach an NVIDIA vGPU to a Linux virtual machine, the process fails with the error message: "The use of a virtual IOMMU is not supported on this virtual machine with a vGPU device." After resolving this initial error, the Linux VM may power on, but the nvidia-smi command fails with the error: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver."
VMware vSphere ESXi
The root cause of the initial error is a configuration conflict within the virtual machine. A parameter (vvtd.enable), which enables a virtual IOMMU for specific workloads, was set to TRUE. This configuration is not compatible with the vGPU architecture, which relies on the hypervisor to manage GPU resources.
The follow-on nvidia-smi error occurs because the virtual machine, while now correctly configured to accept the vGPU device, does not have the necessary NVIDIA vGPU software driver installed and loaded to communicate with the virtual hardware.
To resolve this issue, follow these steps in order:
Fix the IOMMU Configuration Conflict
Power off the virtual machine.
In the vSphere Client, navigate to the VM's Edit Settings > VM Options > Advanced > Configuration Parameters.
Ensure that each of the following parameters is present and set to FALSE, adding any that are missing:
vvtd.enable
pciPassthru.vmiop.allowViommu
pciPassthru.vmiop.enableViommu
Click OK to save the changes and then attempt to attach the vGPU device again.
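The parameter check in the steps above can also be sketched as a small script. This is a minimal illustration, not a VMware tool: it assumes the VM's configuration has been exported as text in the .vmx `key = "value"` form, that key names compare case-insensitively, and the helper name `find_viommu_conflicts` is hypothetical.

```python
import re

# Parameters named in the steps above; all three should read FALSE.
VIOMMU_KEYS = (
    "vvtd.enable",
    "pciPassthru.vmiop.allowViommu",
    "pciPassthru.vmiop.enableViommu",
)

# Assumed .vmx line shape: key = "value"
_LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def find_viommu_conflicts(vmx_text):
    """Return {key: current value, or None if absent} for each key not set to FALSE."""
    found = {}
    for line in vmx_text.splitlines():
        match = _LINE.match(line)
        if match:
            # Assumption: key names are compared case-insensitively.
            found[match.group(1).lower()] = match.group(2)

    conflicts = {}
    for key in VIOMMU_KEYS:
        value = found.get(key.lower())
        if value is None or value.strip().upper() != "FALSE":
            conflicts[key] = value
    return conflicts


sample = '''
vvtd.enable = "TRUE"
pciPassthru.vmiop.allowViommu = "FALSE"
'''
print(find_viommu_conflicts(sample))
```

For the sample text, the function reports vvtd.enable (set to TRUE) and pciPassthru.vmiop.enableViommu (missing) as needing attention. An empty result means all three parameters are present and set to FALSE.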
The vvtd.enable parameter is typically set to TRUE for specific use cases such as nested virtualization or for VMs configured with more than 128 vCPUs, as referenced in Broadcom KB 313255. Applying this setting in a standard vGPU configuration creates an unsupported state.
To resolve the follow-on nvidia-smi error, install the NVIDIA vGPU guest driver in the Linux VM. The NVIDIA vGPU drivers are part of a licensed software suite and are not the same as the standard public drivers. They must be downloaded from the NVIDIA Enterprise portal with a valid license, and their version must match the host's vGPU Manager.
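Before running nvidia-smi inside the guest, it can help to confirm that the NVIDIA kernel module is loaded at all. The sketch below is an illustrative check, not an NVIDIA utility: it assumes a Linux guest that exposes /proc/modules, and the helper name `nvidia_module_loaded` is hypothetical.

```python
def nvidia_module_loaded(proc_modules_text):
    """Return True if a kernel module named exactly 'nvidia' is listed."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] == "nvidia":
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            loaded = nvidia_module_loaded(handle.read())
    except OSError:
        # /proc/modules is unavailable (for example, not a Linux system).
        loaded = False
    print("nvidia kernel module loaded:", loaded)
```

If the result is False, the driver is either not installed or not loaded, which is consistent with the nvidia-smi communication error described above.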