VM Fails to Power On and NVSWITCH Device Disappears from Host After PCI Passthrough Attempt

Article ID: 391724

Updated On: 04-04-2025

Products

VMware vSphere ESXi

Issue/Introduction

When attempting to pass through NVSWITCH PCI devices to a virtual machine, the VM may consistently fail to power on, and the NVSWITCH device may disappear from the ESXi host's PCI device list after each failed attempt. Administrators must then reboot the host to restore visibility of the NVSWITCH device. The issue particularly affects environments where NVIDIA GPUs and NVSWITCH devices are used together for high-performance computing workloads.

Typically, after each VM power-on attempt, users observe that:
- The VM fails to power on with a "Module DevicePowerOn power on failed" error
- The NVSWITCH device that was assigned for passthrough is no longer visible in the host's hardware list
- Subsequent attempts to power on the VM fail with the same pattern, with more devices potentially disappearing
- A host reboot is required to restore visibility of the NVSWITCH devices
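
A quick way to confirm this behavior is to compare the host's PCI device inventory before and after a failed power-on attempt, for example from an SSH session on the host (exact output depends on the ESXi build and hardware):

      lspci | grep -i nvidia        # one line per NVIDIA GPU/NVSwitch device currently visible
      esxcli hardware pci list      # full details for every PCI device on the host

If an NVSWITCH entry that was listed before the power-on attempt is missing afterwards, the host is exhibiting the behavior described in this article.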

Other errors and log messages may include:

  • "PCIPassthru: Failed to get NumaNode for sbdf [device ID]"
  • "PCIPassthru: Selected device [device ID] is outside of the NUMA configuration"
  • "Failed to find a suitable device for pciPassthru0"

Environment

  • ESXi 8.0 and later
  • NVIDIA GPUs (H100, H200 or similar models) configured for passthrough
  • NVIDIA NVSwitch devices (if present)
  • Systems with hot-plug capable PCIe slots (particularly common in enterprise servers with PCIe switches)

Cause

The issue occurs because some PCIe slots are incorrectly configured for hotplug by the firmware, despite lacking proper hardware support for out-of-band device presence detection. Instead, they rely on in-band presence detection, which is neither officially supported by VMware nor recommended by PCI-SIG.

During VM power-on, the PCI passthrough process attempts to reset the device, which triggers a link reset. When the link goes down during this reset, the hotplug-enabled slot incorrectly reports that the device has been removed. This results in a hotplug event on the host, causing the device to be removed from the system unexpectedly.
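
To check whether PCIe hotplug is currently enabled at the VMkernel level, you can query the kernel setting used by the workaround below (on affected hosts it is typically TRUE):

      esxcli system settings kernel list -o enablePCIEHotplug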

Resolution

The following workaround addresses this issue until a permanent fix is available in a future ESXi release:

Disable PCIe hotplug system-wide:

  1. Connect to your ESXi host via SSH

  2. Run the following command:
       esxcli system settings kernel set -s "enablePCIEHotplug" -v "FALSE"
  3. Reboot the ESXi host for the changes to take effect

  4. After the host reboots, attempt to power on the VM with the passthrough devices again
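
To confirm that the change took effect after the reboot, you can check the kernel setting and the host's device list before retrying the power-on:

      esxcli system settings kernel list -o enablePCIEHotplug    # the configured and runtime values should now be FALSE
      lspci | grep -i nvidia                                     # the GPU and NVSWITCH devices should be listed again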

Note: This workaround will disable hot-plug capability for all PCIe devices on the host, meaning you won't be able to add or remove PCIe devices without rebooting the host. However, for environments affected by this issue, this limitation is typically acceptable as it resolves the critical problem with NVSWITCH devices disappearing.

Additional Information

  • When using NVSWITCH devices for NVLink functionality, all connected GPUs and NVSwitch devices must be passed through to the same VM.

  • For proper NVLink functionality, also add these advanced parameters to the VM configuration (see the example .vmx snippet at the end of this section):

      • pciPassthru.allowP2P = "TRUE"
      • pciPassthru.use64bitMMIO = "TRUE"
      • pciPassthru.64bitMMIOSizeGB = "256" (increase if needed for multiple high-memory GPUs)

  • For H200 GPUs with large video memory (141GB each) and NVSwitches, you may need to set a larger MMIO space. For example, with 8 H200 GPUs (141GB × 8 = 1,128GB), set pciPassthru.64bitMMIOSizeGB = "2048" (next power of 2).

  • On some platforms, switching to vGPU mode rather than direct PCI passthrough may provide better stability; vGPU is also the configuration NVIDIA officially supports for NVLink functionality.

  • If problems persist after implementing this fix, check if there are BIOS settings to disable hot-plug capability for the PCIe slots in question.
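
  • As an illustration only, a VM with 8 H200 GPUs and their associated NVSwitch devices passed through might carry the following advanced settings in its .vmx file (the 2048 GB MMIO size follows the sizing example above; adjust it for your GPU count and memory size):

      pciPassthru.allowP2P = "TRUE"
      pciPassthru.use64bitMMIO = "TRUE"
      pciPassthru.64bitMMIOSizeGB = "2048"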