Power-on VM fails with attached vGPU device "nvidia_l40s-24q"

Article ID: 316307

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

A VM with an attached NVIDIA vGPU device fails to power on when the ESXi 8.0.2 host uses an NVIDIA L40S card.


Symptoms:

  • Powering on the VM fails when the vGPU device "nvidia_l40s-24q" is attached on ESXi 8.0.2.
  • If the VM is powered on without the vGPU device attached, it powers on successfully.
  • The issue occurs only with NVIDIA L40S cards; NVIDIA L40 cards work fine.
  • SR-IOV is enabled in the BIOS and vgpu-shared is configured as per Broadcom KB 322001 and KB 318514, but the issue persists. Restarting the xorg service does not help.
  • The NVIDIA memmapMaxRAMMB setting is already 1032192, as per https://docs.nvidia.com/grid/12.0/grid-vgpu-release-notes-vmware-vsphere/index.html, but the issue persists.
  • The vmware.log file for the VM shows the following errors:
[YYYY-MM-DDTHH:MM:SS] In(05)+ vmx - Power on failure messages: Could not initialize plugin 'libnvidia-vgx.so' for vGPU 'nvidia_l40s-24q'.
[YYYY-MM-DDTHH:MM:SS] In(05)+ vmx - Module 'PCIPluginLate' power on failed.
[YYYY-MM-DDTHH:MM:SS] In(05)+ vmx - Failed to start the virtual machine.
[YYYY-MM-DDTHH:MM:SS] In(05)+ vmx -
[YYYY-MM-DDTHH:MM:SS] In(05) vmx - Vix: [mainDispatch.c:4210]: VMAutomation_ReportPowerOpFinished: statevar=0, newAppState=1870, success=1 additionalError=0
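
To check whether a given vmware.log contains this failure signature, a small script can help. The following is a minimal sketch, not a supported tool: the function name find_vgpu_plugin_failures is hypothetical, and the pattern is based only on the "Could not initialize plugin" line in the excerpt above, so other builds may word the message differently.

```python
import re

# Pattern derived from the vmware.log excerpt in this article (assumption:
# the message wording is the same in your environment).
PLUGIN_FAILURE = re.compile(
    r"Could not initialize plugin '(?P<plugin>[^']+)' "
    r"for vGPU '(?P<profile>[^']+)'"
)


def find_vgpu_plugin_failures(log_text):
    """Return a list of (plugin, vGPU profile) pairs, one per failure line."""
    return [
        (match.group("plugin"), match.group("profile"))
        for match in PLUGIN_FAILURE.finditer(log_text)
    ]
```

Read the contents of the VM's vmware.log into a string and pass it to the function. An empty list means this specific signature was not found, so the power-on failure may have a different cause.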

Environment

VMware vSphere ESXi 8.0.x

Cause

The issue is caused by the NVIDIA display mode configuration: the display mode is enabled on the L40S card.

Resolution

Disable the NVIDIA display mode on the L40S card.

Steps to Disable Display Mode:

  1. Power on the virtual machine without attaching the vGPU device.
    This ensures the system boots without initializing the GPU display capabilities.

  2. Install the NVIDIA Display Mode Selector Tool on the guest operating system.
    You can obtain this tool from NVIDIA’s official documentation or driver package.

  3. Run the following command with the NVIDIA Display Mode Selector Tool:

    displaymodeselector --gpumode physical_display_disabled

    This command disables the physical display mode on the GPU.

  4. Reboot the virtual machine.
    A system restart is required for the changes to take effect.

  5. Attach the vGPU device to the virtual machine and power it on.
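
If step 3 needs to be scripted, the command can be wrapped as shown here. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the tool is assumed to be on the PATH under the name displaymodeselector, and the only flag used is the one given in step 3. The tool may need elevated privileges or may ask for confirmation, so the sketch leaves its input and output attached to the terminal and defaults to a dry run.

```python
import shutil
import subprocess


def build_disable_display_command(tool="displaymodeselector"):
    """Build the command from step 3 as an argument list."""
    return [tool, "--gpumode", "physical_display_disabled"]


def disable_physical_display(dry_run=True, tool="displaymodeselector"):
    """Return the command; run it only when dry_run is False."""
    command = build_disable_display_command(tool)
    if dry_run:
        return command
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} was not found on the PATH")
    # Output and prompts are not captured, so the tool can interact with the user.
    subprocess.run(command, check=True)
    return command
```

Calling disable_physical_display() without arguments only returns the argument list, which is useful for reviewing the command before running it. The reboot in step 4 is still required afterwards.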

Additional Information