When multiple virtual machines with vGPU attached are running concurrently on a single host, the following warning message is frequently logged in the vmkernel.log file:
“WARNING: FDS: ###: Could not initialize AIO handles ########: No free handles”
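To gauge how often this warning is being logged, the current log file can be searched for the message text. The following is a minimal sketch rather than an official procedure: the helper name count_aio_warnings is hypothetical, and the default path /var/log/vmkernel.log is assumed to be the active vmkernel log (rotated, compressed logs are not searched).

```shell
# Hypothetical helper: count "No free handles" AIO warnings in a log file.
# Argument 1: log file path (assumed default: /var/log/vmkernel.log).
count_aio_warnings() {
    logfile="${1:-/var/log/vmkernel.log}"
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function's exit status clean.
    grep -c 'Could not initialize AIO handles.*No free handles' "$logfile" || true
}
```

Calling count_aio_warnings with no argument in the ESX shell prints the number of matching lines in the current log. A count that keeps growing suggests that handles are still being exhausted.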
This issue has been observed in the following environments:
VMware vSphere ESX 7.0
VMware vSphere ESX 8.0
VMware ESX 9.0
By default, ESXi can allocate up to 32,768 AIO handles for FDS on a system. When numerous virtual machines with vGPU attached run on a single host, the available AIO handles may become exhausted, and device files can then no longer be opened because no free AIO handles remain.
Under these conditions, the hostd service may crash when attempting to open a device file, hardware sensors may fail to be monitored if ESXi services cannot access the IPMI device file, and other instability symptoms may be observed.
This issue is resolved by updating to VMware vSphere ESX 8.0 Update 3i and changing the FDSNumAIOHandles kernel parameter to 65536.
After installing VMware vSphere ESX 8.0 Update 3i, follow the steps below to change the FDSNumAIOHandles parameter.
The Broadcom engineering team is working on a fix for VMware ESX 9.0.
Steps to change FDSNumAIOHandles:
Set FDSNumAIOHandles to 65536.
    esxcli system settings kernel set -s FDSNumAIOHandles -v 65536
Example:
    # esxcli system settings kernel set -s FDSNumAIOHandles -v 65536
    (No output is returned)
Confirm the new value.
    esxcli system settings kernel list -o FDSNumAIOHandles
Example:
    # esxcli system settings kernel list -o FDSNumAIOHandles
    Name              Type    Configured  Runtime  Default  Description
    ----------------  ------  ----------  -------  -------  -----------
    FDSNumAIOHandles  uint32  65536       65536    32768    Number of AIO handles that we expect LibAIO at the FDS level to dole out. (Range: 1 - 65536)
The Runtime column is changed to 65536.
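The check of the Runtime column can also be scripted. The sketch below is illustrative only: the helper names fds_runtime_value and check_fds_runtime are hypothetical, and it assumes the esxcli output has the column order shown in the example (Name, Type, Configured, Runtime, Default, Description), with the parameter row on its own line.

```shell
# Hypothetical helper: read the output of
#   esxcli system settings kernel list -o FDSNumAIOHandles
# on stdin and print the Runtime value (4th whitespace-separated field).
fds_runtime_value() {
    awk '$1 == "FDSNumAIOHandles" { print $4 }'
}

# Hypothetical helper: succeed only if the Runtime value read from stdin
# equals the expected one.
# Argument 1: expected value (defaults to 65536).
check_fds_runtime() {
    expected="${1:-65536}"
    actual="$(fds_runtime_value)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: Runtime is $actual"
    else
        echo "MISMATCH: Runtime is '${actual}', expected $expected"
        return 1
    fi
}
```

In the ESX shell this could be used as: esxcli system settings kernel list -o FDSNumAIOHandles | check_fds_runtime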
Workaround:
In VMware vSphere ESX 8.0 Update 3h or earlier, or in VMware ESX 9.0, the issue can be avoided either by powering off the VMs that are consuming a large number of AIO handles or by migrating them to a host where AIO handle consumption is lower. AIO handle usage can be checked with vmkvsitools in the ESX shell, as shown below.
In the output of the following command line, VMs with large values in the first column are consuming a significant number of AIO handles. The second column indicates the process ID (VMX Cartel ID) of the VM.
    vmkvsitools lsof | awk '$3=="CHAR" {print $0}' | grep vmgfx | awk '$2=="vmx" {print $1}' | sort | uniq -c | sort -k 1 -n
Example:
    # vmkvsitools lsof | awk '$3=="CHAR" {print $0}' | grep vmgfx | awk '$2=="vmx" {print $1}' | sort | uniq -c | sort -k 1 -n
    <Value1> <VMX_Cartel_ID1>
    <Value2> <VMX_Cartel_ID2>
    <Value3> <VMX_Cartel_ID3>
    ...
The VMX Cartel ID of a VM can be identified using esxcli vm process list.
    esxcli vm process list
Example:
    # esxcli vm process list
    <VM Name>
       World ID: ########
       Process ID: ###
       VMX Cartel ID: #######
       UUID: ## ## ## ## ## ## ## ##-## ## ## ## ## ## ## ##
       Display Name: ####-########-####-####-####-############
       Config File: ##########################################
    ...
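To match the VMX Cartel IDs reported by the vmkvsitools pipeline against VM names, the esxcli vm process list output can be reduced to one line per VM. This is a minimal sketch: the helper name vm_cartel_map is hypothetical, and it assumes each VM entry begins with an unindented name line followed by indented "Key: value" lines.

```shell
# Hypothetical helper: read "esxcli vm process list" output on stdin and print
# one "<VMX Cartel ID> <VM name>" line per VM.
# Assumption: each entry starts with an unindented VM name line, followed by
# indented "Key: value" lines, with entries separated by blank lines.
vm_cartel_map() {
    awk '
        # Unindented, non-empty line: remember it as the current VM name.
        /^[^ \t]/ { name = $0; next }
        # Indented cartel ID line: strip the label and print "ID name".
        /^[ \t]*VMX Cartel ID:/ {
            sub(/^[ \t]*VMX Cartel ID:[ \t]*/, "")
            print $0, name
        }
    '
}
```

In the ESX shell this could be used as: esxcli vm process list | vm_cartel_map. The IDs in the first column can then be looked up against the second column of the vmkvsitools output.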