NVAIE 7.x (vGPU 580.x.x) Guest Driver Download fails in DLVM 9.0.1.0
search cancel

NVAIE 7.x (vGPU 580.x.x) Guest Driver Download fails in DLVM 9.0.1.0

book

Article ID: 425067

calendar_today

Updated On:

Products

VCF Private AI Services

Issue/Introduction

In environments where NVIDIA NVAIE 7.x (corresponding to the vGPU 580 guest driver branch) is already installed on the ESXi host, DLVM 9.0.1.0 will fail to download vGPU guest driver with the following error in /var/log/vgpu-install.log:

VERSION="9.0.1.0"
BUILD_NUMBER="24882593"
vGPU host driver version detected: 580.105.06
...
2025-12-18 14:26:29,309 - urllib3.connectionpool - _new_conn[DEBUG]: Starting new HTTPS connection (1): api.ngc.nvidia.com:443
2025-12-18 14:26:30,285 - urllib3.connectionpool - _make_request[DEBUG]: https://api.ngc.nvidia.com:443 "GET /v2/org/nvidia/team/vgpu/resources/vgpu-guest-driver-7/versions/7.3/files/NVIDIA-Linux-x86_64-580.105.08-grid.run HTTP/1.1" 403 111
2025-12-18 14:26:30,372 - dlvminit.logging_config - error[ERROR]: ERROR Download failed: 403 Client Error: Forbidden for url: https://api.ngc.nvidia.com/v2/org/nvidia/team/vgpu/resources/vgpu-guest-driver-7/versions/7.3/files/NVIDIA-Linux-x86_64-580.105.08-grid.run

Environment

VMware Cloud Foundation 9.0.0.0

VMware Cloud Foundation 9.0.1.0

VMware Private AI Foundation 9.0.1.0 (DLVM 9.0.1.0)

Cause

Starting with the NVAIE 7.0 / vGPU 580 branch, NVIDIA introduced a breaking change to the resource naming convention for guest drivers hosted on the NGC catalog: 

  • Legacy Resource Name (NVAIE 6.x and earlier): vgpu-guest-driver
  • New Resource Name (NVAIE 7.x / vGPU 580.x.x): vgpu-for-compute-guest-driver

The internal deployment logic of DLVM 9.0.1.0 uses hardcoded strings that match the legacy naming convention. When the DLVM attempts to fetch a 580-series guest driver using the outdated resource path, the NGC API returns a 403 Forbidden error.

Resolution

Before proceeding, check if a version newer than DLVM 9.0.1.0 is available. If a more recent version has been released, it is recommended to use that version as it contains the fix for the NVIDIA guest driver naming convention.

For environments that run DLVM 9.0.1.0, to work around the issue, manually overriding the resource name mapping using a cloud-init script during the deployment process.

Step 1: Prepare the Cloud-Init Configuration

Use the following YAML configuration to patch the internal constants.py file within the DLVM. This patch redirects BASE_GUEST_DRIVER_URL to the correct vgpu-for-compute-guest-driver resource.

#cloud-config

write_files:

- encoding: b64

  path: /opt/dlvm/dlvminit/constants.py

  owner: root:root

  permissions: '0755'

  content: IyBDb3B5cmlnaHQgKGMpIDIwMjUgQnJvYWRjb20uIEFsbCBSaWdodHMgUmVzZXJ2ZWQuDQojIEJyb2FkY29tIENvbmZpZGVudGlhbC4gVGhlIHRlcm0gIkJyb2FkY29tIiByZWZlcnMgdG8gQnJvYWRjb20gSW5jLg0KIyBhbmQvb3IgaXRzIHN1YnNpZGlhcmllcy4NCg0KZnJvbSBlbnVtIGltcG9ydCBFbnVtDQoNCiMgT1ZGIFByb3BlcnR5IEtleXMNCk9WRl9QUk9QRVJUSUVTID0gew0KICAgICJJTlNUQU5DRV9JRCI6ICJpbnN0YW5jZS1pZCIsDQogICAgIkhPU1ROQU1FIjogImhvc3RuYW1lIiwNCiAgICAiU0VFREZST00iOiAic2VlZGZyb20iLA0KICAgICJQVUJMSUNfS0VZUyI6ICJwdWJsaWMta2V5cyIsDQogICAgIlVTRVJfREFUQSI6ICJ1c2VyLWRhdGEiLA0KICAgICJQQVNTV09SRCI6ICJwYXNzd29yZCIsDQogICAgIlZHUFVfTElDRU5TRSI6ICJ2Z3B1LWxpY2Vuc2UiLA0KICAgICJOR0NfQVBJX0tFWSI6ICJuZ2MtYXBpLWtleSIsDQogICAgIk5WSURJQV9QT1JUQUxfQVBJX0tFWSI6ICJudmlkaWEtcG9ydGFsLWFwaS1rZXkiLA0KICAgICJWR1BVX0hPU1RfRFJJVkVSX1ZFUlNJT04iOiAidmdwdS1ob3N0LWRyaXZlci12ZXJzaW9uIiwNCiAgICAiVkdQVV9VUkwiOiAidmdwdS11cmwiLA0KICAgICJSRUdJU1RSWV9VUkkiOiAicmVnaXN0cnktdXJpIiwNCiAgICAiUkVHSVNUUllfVVNFUiI6ICJyZWdpc3RyeS11c2VyIiwNCiAgICAiUkVHSVNUUllfUEFTU1dEIjogInJlZ2lzdHJ5LXBhc3N3ZCIsDQogICAgIlNFQ09OREFSWV9SRUdJU1RSWV9VUkkiOiAicmVnaXN0cnktMi11cmkiLA0KICAgICJTRUNPTkRBUllfUkVHSVNUUllfVVNFUiI6ICJyZWdpc3RyeS0yLXVzZXIiLA0KICAgICJTRUNPTkRBUllfUkVHSVNUUllfUEFTU1dEIjogInJlZ2lzdHJ5LTItcGFzc3dkIiwNCiAgICAiSU1BR0VfT05FTElORVIiOiAiaW1hZ2Utb25lbGluZXIiLA0KICAgICJET0NLRVJfQ09NUE9TRV9VUkkiOiAiZG9ja2VyLWNvbXBvc2UtdXJpIiwNCiAgICAiQ09ORklHX0pTT04iOiAiY29uZmlnLWpzb24iLA0KICAgICJDT05EQV9FTlZJUk9OTUVOVF9JTlNUQUxMIjogImNvbmRhLWVudmlyb25tZW50LWluc3RhbGwiLA0KICAgICJETFZNQ09OU09MRV9FTkFCTEUiOiAiZGx2bWNvbnNvbGUtZW5hYmxlIiwNCn0NCg0KRU5WSVJPTk1FTlRTX1ZBUlMgPSB7DQogICAgIyBleHBvcnQgRExWTV9ERUJVRz10cnVlIGZvciBkZWJ1ZyBtb2RlDQogICAgIkRFQlVHX01PREUiOiAiRExWTV9ERUJVRyIsDQp9DQoNCk5HQ19WR1BVX0RSSVZFUl9DQVRBTE9HX1VSTCA9ICJodHRwczovL2FwaS5uZ2MubnZpZGlhLmNvbS92Mi9yZXNvdXJjZXMvb3JnL252aWRpYS90ZWFtL3ZncHUvdmdwdV9kcml2ZXJfY2F0YWxvZy9sYXRlc3QvZmlsZXM/cmVkaXJlY3Q9dHJ1ZSZwYXRoPXZncHVEcml2ZXJDYXRhbG9nLnlhbWwiDQpCQVNFX0dVRVNUX0RSSVZFUl9VUkwgPSAoDQogICAgImh0dHBzOi8vYXBpLm5nYy5udmlkaWEuY29tL3YyL29yZy9udmlkaWEvdGVhbS92Z3B1L3Jlc291cmNlcy8iDQogICAgInZncHUtZm9yLWNvbXB1dGUtZ3Vlc3QtZHJpdmVyLXttYWpvcl9udmFpZV9yZWxlYXNlfS92ZXJzaW9ucy97bnZhaWVfcmVsZWFzZX0vZmlsZXMvIg0KICAgICJOVklESUEtTGludXgteDg2XzY0LXtndWVzdF9kcml2ZXJfdmVyc2lvbn0tZ3JpZC5ydW4iDQopDQpOVklESUFfUFJPRFVDVCA9ICdudmFpZScNCkhPU1RfVFlQRSA9ICJob3N0Ig0KR1VFU1RfVFlQRSA9ICJndWVzdCINCkRSSVZFUl9NQVRDSF9SVUxFUyA9IHsNCiAgICAnaG9zdCc6IFsNCiAgICAgICAgeyd0eXBlJzogJ2hvc3QnfSwNCiAgICAgICAgeydoeXBlcnZpc29yJzogJ0VTWGknfQ0KICAgIF0sDQogICAgJ2d1ZXN0JzogWw0KICAgICAgICB7J3R5cGUnOiAnZ3Vlc3QnfSwNCiAgICAgICAgeydvcyc6ICdMaW51eCd9DQogICAgXQ0KfQ0KIyBNYWdpYyBzdHJpbmcgY29uc3RhbnRzIGZvciB2Z3B1IGRyaXZlciBjYXRhbG9nDQpjbGFzcyBDYXRhbG9nS2V5czoNCiAgICBEUklWRVIgPSAiZHJpdmVyIg0KICAgIEJSQU5DSCA9ICJicmFuY2giDQogICAgUFJPRFVDVCA9ICJwcm9kdWN0Ig0KICAgIFJFTEVBU0UgPSAicmVsZWFzZSINCiAgICBWRVJTSU9OID0gInZlcnNpb24iDQogICAgVFlQRSA9ICJ0eXBlIg0KICAgIE5BTUUgPSAibmFtZSINCiAgICBBTExPVyA9ICJhbGxvdyINCg0KTE9DQUxfTUFOSUZFU1RfRklMRSA9ICIvb3B0L2Rsdm0vZGx2bWluaXQvdmdwdS1uZ2MtbWFuaWZlc3QudHh0Ig0KDQpHdWVzdEJvb3RzdHJhcF9LRVkgPSAiZ3Vlc3RpbmZvLnZtc2VydmljZS5ib290c3RyYXAuY29uZGl0aW9uIg0KDQpQUk9YWV9FTlZTID0gWw0KICAgICdodHRwX3Byb3h5JywgJ0hUVFBfUFJPWFknLCANCiAgICAnaHR0cHNfcHJveHknLCAnSFRUUFNfUFJPWFknLCANCiAgICAnZnRwX3Byb3h5JywgJ0ZUUF9QUk9YWScsIA0KICAgICdub19wcm94eScsICdOT19QUk9YWScsIA0KICAgICdhbGxfcHJveHknLCAnQUxMX1BST1hZJywNCl0NCg0KU1lTVEVNX1VTRVIgPSAidm13YXJlIg0KDQpjbGFzcyBIYXNoQWxnb3JpdGhtKEVudW0pOg0KICAgIE1ENSA9ICdtZDUnDQogICAgU0hBMSA9ICdzaGExJw0KICAgIFNIQTIyNCA9ICdzaGEyMjQnDQogICAgU0hBMjU2ID0gJ3NoYTI1NicNCiAgICBTSEEzODQgPSAnc2hhMzg0Jw0KICAgIFNIQTUxMiA9ICdzaGE1MTInDQoNCkxPR0dBQkxFX0ZJTEVfUEFUSFMgPSBbDQogICAgIi9vcHQvZGx2bS9lbnZpcm9ubWVudCIsDQogICAgIi9ldGMvc3lzdGVtZC9zeXN0ZW0vcmMtbG9jYWwuc2VydmljZS5kL2Vudmlyb25tZW50LmNvbmYiLA0KICAgICIvZXRjL3N5c3RlbWQvc3lzdGVtL2RvY2tlci5zZXJ2aWNlLmQvcHJveHkuY29uZiIsDQogICAgIi9ldGMvbnZpZGlhL2dyaWRkLmNvbmYiDQpdDQoNCkRFRkFVTFRfQ0FfQlVORExFID0gJy9ldGMvc3NsL2NlcnRzL2NhLWNlcnRpZmljYXRlcy5jcnQnDQpCQVNFX1RNUF9ESVIgPSAnL29wdC9kbHZtL3RtcCc=

Step 2: Applying the Workaround

To apply the fix, follow these steps to inject the configuration into your deployment environment:

  1. If deploying DLVM via vCenter Server, 
    • Copy the YAML code block from Step 1
    • Base64 encode the entire YAML block
    • Paste the resulting Base64 string into the 'Encoded user-data'field in the DLVM OVF parameter
  2. If deploying DLVM via VMService,
    • Copy the YAML code block from Step 1
    • Base64 encode the entire YAML block
    • Paste the resulting Base64 string into the 'user-data'field in the DLVM OVF parameter
  3. If If deploying DLVM via VCFA,
    • Check the 'Custom cloud-init'
    • Paste the raw YAML script (from Step 1) directly into the provided text area.