lspci) but do not appear correctly in the vCenter Server hardware inventory
"Operation failed! An error occurred during host configuration: Operation failed, diagnostics report: GetDeviceID failed."
/var/run/log/hostd.log on the affected host contains the following entries:
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Hostsvc opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] Config wants to enable passthrough for ####:##:##:#
YYYY-MM-DDTHH:MM:SS Wa(164) Hostd[#####]: [Originator@6876 sub=Libs opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] VmkCtl: GetDeviceID failed for ####:##:##:# Device-ID not found!
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Hostsvc opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] Re-fetching all PCI devices since SR-IOV configuration has been updated
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Hostsvc.AssignableHardwareProvider opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] AH Generating NVIDIA VDG device complex spec
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Libs opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] NvidiaVgpuInfo: Failed to open nvidia library
YYYY-MM-DDTHH:MM:SS Wa(164) Hostd[#####]: [Originator@6876 sub=Libs opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] NvidiaDeviceGroupInfo: vgpuInfo not available.
YYYY-MM-DDTHH:MM:SS Wa(164) Hostd[#####]: [Originator@6876 sub=Hostsvc.AssignableHardwareProvider opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] AH Device complex generator directory path /usr/lib/vmware/vdg/bin doesn't exist or is not a directory
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Hostsvc.AssignableHardwareProvider opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] AH Dtree deviceGroup identical (devices/types): pci: 8/1 QAT: 0/0 VDG: 0/0
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Hostsvc opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] Populating NUMA PCI ids ...
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=AdapterServer opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] AdapterServer caught exception; <<52####d9-a##0-1##9-1##d-02###61d15, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 51549'>>, ha-pcipassthrusystem, vim.host.PciPassthruSystem.updatePassthruConfig,<vim.version.v8_0 internal, 8.0.3.0>, [N11HostdCommon18VmomiAdapterServer19ActivationResponderE:0x00000090abc6a3f8]>, N7Hostsvc21HaPlatformConfigFault9ExceptionE(Fault cause: vim.fault.PlatformConfigFault
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: [Originator@6876 sub=Solo.Vmomi opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] Arg config:
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: --> (vim.host.PciPassthruConfig) [
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: --> (vim.host.PciPassthruConfig) {
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: --> id = "####:##:##:#",
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: --> passthruEnabled = true,
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: --> }
YYYY-MM-DDTHH:MM:SS Db(167) Hostd[#####]: --> ]
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Solo.Vmomi opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] Throw vim.fault.PlatformConfigFault
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: [Originator@6876 sub=Solo.Vmomi opID=mj###ujq-20###481-auto-##r7-h5:720####46-82-f9-c##d sid=52####2d9 user=vpxuser:DOMAIN\USER] Result:
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> (vim.fault.PlatformConfigFault) {
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> faultMessage = (vmodl.LocalizableMessage) [
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> (vmodl.LocalizableMessage) {
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> key = "com.vmware.esx.hostctl.default",
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> arg = (vmodl.KeyAnyValue) [
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> (vmodl.KeyAnyValue) {
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> key = "reason",
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> value = "GetDeviceID failed."
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> }
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> ],
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> }
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> ],
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> text = "",
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> msg = ""
YYYY-MM-DDTHH:MM:SS In(166) Hostd[#####]: --> }
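To check whether a host shows this signature, the log can be filtered for the key messages instead of being read in full. The following is a minimal sketch, assuming a POSIX shell session on the affected host. The function name find_passthru_failures is hypothetical and not part of ESXi; the search strings and the default log path are taken from the entries above.

```shell
#!/bin/sh
# Hypothetical helper (not an ESXi tool): print the hostd.log lines that
# match the failure signature described in this article.
# Usage: find_passthru_failures [log_file]
find_passthru_failures() {
    # Default to the log path named in this article if no argument is given.
    log_file="${1:-/var/run/log/hostd.log}"
    # -E enables alternation, so one pass covers the passthrough request,
    # the device ID lookup failure, and the resulting fault.
    grep -E "Config wants to enable passthrough for|GetDeviceID failed|vim.fault.PlatformConfigFault" "$log_file"
}
```

Calling find_passthru_failures with no argument reads /var/run/log/hostd.log. A passthrough request followed shortly by "GetDeviceID failed" for the same PCI address suggests the host is affected.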
This issue occurs because certain enterprise server PCIe slots are incorrectly configured by firmware for hotplug support without having the necessary hardware for out-of-band presence detection. When vCenter attempts to initialize or reset the NVSwitch during passthrough configuration, the link reset triggers a false "device removed" hotplug event. This causes the host to lose track of the device ID during the configuration task, resulting in the GetDeviceID failed error.
To resolve this issue, disable the native PCIe hotplug interrupt and adjust the passthrough mapping for NVIDIA Hopper-series GPUs.
esxcli system settings kernel set -s "enablePCIEHotplug" -v "FALSE"
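After changing the setting, it can be useful to read it back. The following is a small sketch, assuming an ESXi shell session. The wrapper name show_hotplug_setting is hypothetical; the option name is the one used in the command above, and the wrapper only calls esxcli's kernel settings list subcommand.

```shell
#!/bin/sh
# Hypothetical wrapper (not an ESXi tool): display the current state of the
# enablePCIEHotplug kernel setting changed by the command above.
show_hotplug_setting() {
    # Refuse to run where esxcli is unavailable, for example on a workstation.
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found; run this on the ESXi host" >&2
        return 1
    fi
    # List only the one option instead of every kernel setting.
    esxcli system settings kernel list -o enablePCIEHotplug
}
```

If the configured and runtime values reported for the option differ, the change has likely not taken effect yet; kernel settings of this kind generally require a host reboot.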