ESXi 8.0 PSOD with IOMMU Fault Reason 0x79 involving NVIDIA Driver 580.126.08 and Management Daemon 580.65.05

Article ID: 431541

Products

VMware vSphere ESXi

Issue/Introduction

  • ESXi 8.0 hosts may experience a recurring Purple Screen of Death (PSOD) when using NVIDIA vGPU profiles. The crash occurs when the IOMMU detects an invalid memory translation entry.

  • The host crashes with a purple screen displaying the following signature:

    Log File Path: /var/run/log/vmkernel-zdump.log

YYYY-MM-DDT HH:MM:SS cpu#:#######)@BlueScreen: IOMMU Fault detected for (vmgfx#/nvidia-gpu) IOaddr: ############ Reason: 0x79 (Invalid Read/Write permission(R=W=0) for second-level paging entry) Domain: ############
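
To confirm that a given PSOD matches this signature, the fault line can be reduced to its device and reason code. The following is a minimal sketch, not an official diagnostic: the function name parse_iommu_fault is hypothetical, and it assumes the log line follows the layout shown above.

```shell
# Hypothetical helper: reads log lines on stdin and prints "<device> <reason>"
# for every line that contains an IOMMU fault signature.
parse_iommu_fault() {
    sed -n 's/.*IOMMU Fault detected for (\([^)]*\)).*Reason: \(0x[0-9a-fA-F]*\).*/\1 \2/p'
}

# Example on an ESXi host (log path taken from this article):
#   parse_iommu_fault < /var/run/log/vmkernel-zdump.log
```

A result that names an nvidia-gpu device together with reason 0x79 corresponds to the signature described in this article.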

Environment

VMware vSphere ESXi 8.0

GPU Driver: NVD-VMware_ESXi_8.0.0_Driver version 580.126.08-1OEM.800.1.0.20613240

Management Daemon: nvdgpumgmtdaemon version 580.65.05-1OEM.700.1.0.15843807

Cause

The NVIDIA 580.x driver series contains a logic error where memory pages assigned to the GPU are occasionally marked with "No Read" and "No Write" permissions (R=W=0) in the second-level page tables.

When the IOMMU hardware processes a DMA (Direct Memory Access) request against one of these pages, it raises a fault, which crashes the ESXi host.
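
The R=W=0 condition can be illustrated with a small bit test. This is only a sketch under a stated assumption: it uses the Intel VT-d second-level entry layout, in which bit 0 is Read and bit 1 is Write. The function name and the sample values are hypothetical, and this is not how the driver or ESXi performs the check.

```shell
# Assumption: Intel VT-d second-level paging entry, bit 0 = Read, bit 1 = Write.
# Succeeds (exit status 0) when both permission bits are clear, i.e. R=W=0.
entry_has_no_rw() {
    [ $(( $1 & 0x3 )) -eq 0 ]
}

# Made-up sample entries: the first has R=W=0, the second has R=W=1.
for entry in 0x1000 0x1003; do
    if entry_has_no_rw "$entry"; then
        echo "$entry: R=W=0, a DMA access through this entry would fault"
    else
        echo "$entry: at least one of Read/Write is permitted"
    fi
done
```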

Resolution

  • Verify the installed driver versions by running the following command:
    1. esxcli software vib list | grep -i nvidia
    2. Confirm whether both 580.126.08 and 580.65.05 are installed.
  • Contact NVIDIA Support for a driver update (e.g., version 580.131 or newer) that addresses "IOMMU Fault 0x79" on ESXi 8.0.
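
The version check in the first step can be scripted. The sketch below is one possible approach rather than a supported tool: the function name has_affected_nvidia_vibs is hypothetical, and it matches on the two version strings from this article instead of relying on VIB name or vendor columns.

```shell
# Versions named in this article.
DRIVER_VERSION="580.126.08"
DAEMON_VERSION="580.65.05"

# Hypothetical helper: reads `esxcli software vib list` output on stdin.
# Succeeds (exit status 0) only when both versions appear in the list.
has_affected_nvidia_vibs() {
    vib_list=$(cat)
    printf '%s\n' "$vib_list" | grep -F -q "$DRIVER_VERSION" &&
        printf '%s\n' "$vib_list" | grep -F -q "$DAEMON_VERSION"
}

# Example on an ESXi host:
#   esxcli software vib list | has_affected_nvidia_vibs \
#       && echo "Both 580.126.08 and 580.65.05 are installed"
```

Reading from stdin keeps the check usable against a saved copy of the VIB list as well as against a live host.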