"Failed to Power On" Virtual Machines with PCI Passthrough and Insufficient MMIO Allocation
Article ID: 323402
Products
VMware vSphere ESXi
Issue/Introduction
When configuring PCI passthrough for a GPU within a virtual machine (VM), proper memory-mapped I/O (MMIO) allocation is essential for successful VM boot and operation.
MMIO, a fundamental aspect of the PCI specification, facilitates direct access to I/O devices by mapping them into the system's memory space. This approach eliminates the need for dedicated I/O ports, allowing the CPU to interact with devices using standard memory access instructions.
For GPU passthrough specifically, MMIO is what maps the GPU's framebuffer memory into the VM's memory space. This lets the CPU move data to and from the GPU efficiently, which graphics rendering and overall VM performance depend on.
Calculating MMIO Value:
The MMIO value is determined by a simple calculation based on the total framebuffer memory allocated to the VM's GPUs. To ensure compatibility, this value must be a power of 2.
Powers of 2 are 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 …
Example: one NVIDIA H100 PCIe 80GB has 80GB of framebuffer memory, which falls between 64GB and 128GB. Round up to the next power of 2 (128GB), then round up again to the power of 2 after that (256GB) to get the correct setting. If the value is set too low, the VM won’t boot.
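The rounding rule in the example above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an official tool: the function names are hypothetical, and the handling of a total that is already an exact power of 2 (it is kept as the first step and then doubled) is an assumption, since the example only covers a value that falls between two powers of 2.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of 2 that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("framebuffer size must be at least 1 GB")
    return 1 << (n - 1).bit_length()


def mmio_size_gb(total_framebuffer_gb: int) -> int:
    """Apply the two-step rounding from the example.

    Step 1: round the total framebuffer size up to a power of 2.
    Step 2: move up to the next power of 2 after that (double it).
    Assumption: an exact power of 2 is left unchanged by step 1.
    """
    first = next_power_of_two(total_framebuffer_gb)
    return first * 2


# One NVIDIA H100 PCIe 80GB: 80 -> 128 -> 256
print(mmio_size_gb(80))
```

For a VM with several GPUs, pass the sum of their framebuffer sizes in GB as total_framebuffer_gb.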
Powering on a VM with a PCI device attached fails with the error "Module DevicePowerOn power on failed".
The VM powers off again after the "Power on virtual machine" task completes.
In the vmware.log file (/vmfs/volumes/datastore/vmdirectory/vmware.log) there are entries similar to:
YYYY-MM-DDTHH:MM:SS In(05) vcpu-0 - PCIPassthru: successfully created the IOMMU mappings
YYYY-MM-DDTHH:MM:SS In(05) vcpu-0 - Guest: EFI ROM version: VMW71.00V.21100432.B64.2301110304 (64-bit RELEASE)
YYYY-MM-DDTHH:MM:SS In(05) vcpu-0 - BIOS-UUID is 42 3e 61 c2 32 fc f5 37-1c 79 d0 ee 3c 29 e2 4a
YYYY-MM-DDTHH:MM:SS In(05) vcpu-0 - Msg_Post: Error
YYYY-MM-DDTHH:MM:SS In(05) vcpu-0 - [msg.efi.pciMmioError] The firmware could not allocate xxxxxxx KB of PCI MMIO. Increase the size of PCI MMIO and try again.
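If a copy of vmware.log is available off the host, the MMIO error entries shown above can be picked out with a short script. The following is a minimal sketch, assuming the message text matches the excerpt exactly. The function name is hypothetical, and the regular expression is derived only from the log line quoted in this article.

```python
import re

# Matches the msg.efi.pciMmioError entry and captures the amount (in KB)
# that the firmware reported it could not allocate.
MMIO_ERROR = re.compile(
    r"\[msg\.efi\.pciMmioError\].*?could not allocate (\S+) KB of PCI MMIO"
)


def find_mmio_errors(log_text: str) -> list[str]:
    """Return the KB amounts from every PCI MMIO allocation error in the log."""
    return [match.group(1) for match in MMIO_ERROR.finditer(log_text)]
```

Read the log file into a string and pass it to find_mmio_errors. An empty result means no entry of this form was found, which suggests the power-on failure has a different cause.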
Cause
This issue usually occurs when the MMIO space allocated to the VM is not large enough for all of its GPUs.
Resolution
To resolve the issue, configure the MMIO (memory-mapped I/O) parameters for the VM:
Log in to vCenter Server and navigate to the VM's settings.
Under the VM settings, select "VM Options" > "Advanced" > "Edit Configuration."
Once on the Configuration parameters screen, add two more parameters: