Virtual machine not utilizing attached NVIDIA GPUs despite hardware visibility in the guest OS

Article ID: 435335

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • A virtual machine configured with NVIDIA GPUs successfully detects the hardware (e.g., the device is visible via lspci in Linux or Device Manager in Windows).
  • However, the Guest OS or specific applications do not use the GPU for processing tasks, even under load.
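
As an illustration of the "hardware is visible" half of this symptom, the following minimal sketch filters `lspci -nn` output in a Linux guest for devices that carry NVIDIA's PCI vendor ID (10de). The function names are hypothetical and chosen only for this example; the sketch assumes `lspci` is installed in the guest.

```python
import subprocess
from typing import List

NVIDIA_VENDOR_ID = "10de"  # PCI vendor ID assigned to NVIDIA


def find_nvidia_devices(lspci_output: str) -> List[str]:
    """Return the `lspci -nn` lines whose [vendor:device] ID uses NVIDIA's vendor ID."""
    marker = f"[{NVIDIA_VENDOR_ID}:"
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if marker in line.lower()
    ]


def list_nvidia_devices() -> List[str]:
    """Run `lspci -nn` in the guest and keep only the NVIDIA devices."""
    result = subprocess.run(
        ["lspci", "-nn"], capture_output=True, text=True, check=True
    )
    return find_nvidia_devices(result.stdout)


if __name__ == "__main__":
    for device in list_nvidia_devices():
        print(device)
```

A non-empty result only shows that the PCI device was presented to the guest. It says nothing about whether a driver is bound to the device or whether any workload is using it.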

Environment

VMware vSphere 

Cause

In a vSphere environment, the hypervisor's role is to present the GPU hardware or vGPU profile to the virtual machine. Once the Guest OS identifies the PCI device, the vSphere layer has fulfilled its role. The allocation, scheduling, and utilization of the GPU are then controlled entirely by the Guest OS and the NVIDIA drivers installed in the guest.
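
Because utilization is decided inside the guest, it can be observed from there as well. The sketch below is one possible way to do so, assuming a Linux or Windows guest with the NVIDIA guest driver installed and `nvidia-smi` on the PATH. It queries the driver version, GPU utilization, and used memory per GPU. The helper names (`parse_gpu_query`, `query_gpus`, `idle_gpus`) are hypothetical.

```python
import csv
import io
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi, in the order they appear in each CSV row.
QUERY_FIELDS = ["name", "driver_version", "utilization.gpu", "memory.used"]


def parse_gpu_query(csv_text: str) -> List[Dict[str, str]]:
    """Turn `nvidia-smi --format=csv,noheader,nounits` output into one dict per GPU."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [
        dict(zip(QUERY_FIELDS, (value.strip() for value in row)))
        for row in reader
        if row  # skip blank lines
    ]


def query_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi inside the guest and parse its per-GPU report."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_query(result.stdout)


def idle_gpus(gpus: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the GPUs that report exactly 0% utilization."""
    return [gpu for gpu in gpus if gpu["utilization.gpu"] == "0"]


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

If `nvidia-smi` is missing or exits with an error, that in itself suggests a guest driver problem rather than a vSphere one. Sampling `query_gpus()` while the workload is running gives a more meaningful reading than a single idle snapshot.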

Resolution

Because the vSphere layer is functioning as designed by presenting the hardware, further troubleshooting must be performed within the guest software stack.

  1. Host-Level Verification: Confirm that the NVIDIA VIBs are correctly installed on the ESXi host and that the GPU is not experiencing hardware-level errors at the host layer. Refer to Installing and configuring the NVIDIA VIB on ESXi (367541) for host-side validation.
  2. Third-Party Vendor Engagement: If the hardware is visible in the guest but GPU utilization remains at 0%, engage the Guest OS vendor or NVIDIA support to investigate driver behavior and application resource mapping.
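
For the host-level verification in step 1, the installed VIBs can be listed with `esxcli software vib list` in an ESXi shell session. The sketch below filters that listing for NVIDIA entries. It rests on an assumption that is not taken from this article: that NVIDIA host driver VIBs contain "nvidia" or "nvd" in their name or vendor column. The function names are hypothetical, and the filter can also be applied to output saved from the host.

```python
import subprocess
from typing import List

# Assumption: NVIDIA host driver VIBs carry one of these markers in their
# name or vendor column. Adjust the tuple if your VIBs are named differently.
NVIDIA_MARKERS = ("nvidia", "nvd")


def filter_nvidia_vibs(vib_list_output: str) -> List[str]:
    """Return the lines of a VIB listing that mention an NVIDIA marker."""
    return [
        line.strip()
        for line in vib_list_output.splitlines()
        if any(marker in line.lower() for marker in NVIDIA_MARKERS)
    ]


def list_host_nvidia_vibs() -> List[str]:
    """Run `esxcli software vib list` on the host and keep the NVIDIA entries."""
    result = subprocess.run(
        ["esxcli", "software", "vib", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return filter_nvidia_vibs(result.stdout)


if __name__ == "__main__":
    vibs = list_host_nvidia_vibs()
    if vibs:
        print("\n".join(vibs))
    else:
        print("No NVIDIA VIB found in the listing")
```

An empty result points back to the host-side installation covered in Installing and configuring the NVIDIA VIB on ESXi (367541). A populated result supports moving the investigation into the guest.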

Additional Information

For instructions on verifying or configuring the host-level setup, refer to: Installing and configuring the NVIDIA VIB on ESXi (367541).