When you try to enable SR-IOV for the GPU card, you see the following in the ESXi CLI:
[root@ESXi:~] nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
The host logs, all located in /var/run/log, show the following entries:
hostd.log
YYYY-MM-DDTHH:MM:SS.SSSSZ In(166) Hostd[2100012] [Originator@6876 sub=Libs] NvmlUser: nvmlInit error code: 28
YYYY-MM-DDTHH:MM:SS.SSSSZ In(166) Hostd[2100012] [Originator@6876 sub=Libs] NvidiaVgpuInfo: Failed to open nvidia library
YYYY-MM-DDTHH:MM:SS.SSSSZ Wa(164) Hostd[2100012] [Originator@6876 sub=Libs] NvidiaDeviceGroupInfo: vgpuInfo not available.
vmkwarning.log
YYYY-MM-DDTHH:MM:SS.SSSSZ Db(15) esxupdate[2107898] Output: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
The syslog shows the following:
YYYY-MM-DDTHH:MM:SS.SSSSZ Er(11) nvidia-vgpud[2100477] error: failed to allocate client: 59
YYYY-MM-DDTHH:MM:SS.SSSSZ Er(11) nvidia-vgpud[2100477] error: failed to read pGPU information: 9
YYYY-MM-DDTHH:MM:SS.SSSSZ Er(11) nvidia-vgpud[2100477] error: failed to send vGPU configuration info to RM: 9
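To check whether a host shows the same symptoms, the log excerpts above can be turned into a small scanner. This is a minimal sketch, not a VMware or NVIDIA tool: the function name `find_signatures` and the labels are illustrative assumptions, and the patterns are copied from the messages quoted above, so they may not match other driver or ESXi versions.

```python
import re

# Error signatures copied from the log excerpts above.
# The labels on the left are hypothetical names chosen for this sketch.
SIGNATURES = {
    "nvml_init_failed": re.compile(r"NvmlUser: nvmlInit error code: \d+"),
    "nvidia_library_open_failed": re.compile(r"NvidiaVgpuInfo: Failed to open nvidia library"),
    "vgpu_info_unavailable": re.compile(r"NvidiaDeviceGroupInfo: vgpuInfo not available"),
    "nvidia_smi_failed": re.compile(r"NVIDIA-SMI has failed because it couldn't communicate"),
    "vgpud_error": re.compile(r"nvidia-vgpud\[\d+\] error: "),
}


def find_signatures(lines):
    """Return {label: [matching lines]} for every signature found in lines."""
    hits = {}
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(line.rstrip("\n"))
    return hits
```

The function accepts any iterable of strings, so an open file handle for a log under /var/run/log can be passed directly. An empty result only means that none of these exact messages were found, not that the driver is healthy.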
vSphere 8.x
An incorrect NVIDIA driver version causes this behavior.
To obtain the correct drivers for your GPU, contact NVIDIA.
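Before contacting NVIDIA, it can help to note which NVIDIA packages are currently installed on the host. The sketch below assumes you have captured the text output of `esxcli software vib list` yourself. The helper name `nvidia_packages` is hypothetical, and the code makes no assumption about the column layout: it only keeps lines that mention NVIDIA.

```python
def nvidia_packages(vib_listing):
    """Return the lines of a VIB listing that mention NVIDIA (case-insensitive).

    vib_listing is the raw text of a package listing, for example the saved
    output of `esxcli software vib list`. No column layout is assumed.
    """
    return [
        line.strip()
        for line in vib_listing.splitlines()
        if "nvidia" in line.lower()
    ]
```

The returned lines can be included as-is when you ask NVIDIA which driver version matches your GPU and vSphere release.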