ESXi 7.0.3 with Qlogic qedentv driver 3.70.7.0 may intermittently go unresponsive due to hostd service entering non-responsive state
Article ID: 317890
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms:
The hostd service is detected as non-responsive.
Hosts lose their connection to vCenter.
ESXCLI commands fail with a connection failed error.
Restarting the management agents on the ESXi host might not help.
/var/run/log/vmkernel.log shows qedentv stuck in an infinite wait in the RXQ destroy context on a vmnic.
qedentv prints the same log line for hours, one line per second:
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
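The repeated message above can be counted from the ESXi shell to confirm the symptom. The following is a minimal sketch: the helper name count_stats_waits is hypothetical, and the log path is the one given in this article.

```shell
# Hypothetical helper: count qedentv "Waiting for stats" lines in a log file.
count_stats_waits() {
    # grep -c prints the number of matching lines; "|| true" keeps a zero
    # count from being treated as a failure by the calling shell.
    grep -c 'qedentv_free_mem_fp.*Waiting for stats' "$1" || true
}

# Log path taken from this article; the check is skipped if the file is absent.
if [ -r /var/run/log/vmkernel.log ]; then
    count_stats_waits /var/run/log/vmkernel.log
fi
```

A count that keeps growing by roughly one per second is consistent with the infinite wait described above.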
Environment
VMware vSphere 7.0.x
Cause
This issue is caused by a race condition in the qedentv driver.
Resolution
The infinite loop while waiting for stats completion in fp mem free is resolved in qedentv driver 3.70.35.0. This newer driver adds a timeout for stats and netpoll completion in fp mem free.
This issue is also fixed in the latest release of the inbox qedentv driver for ESXi 8.0. For details of the async driver release, contact the hardware vendor.
Workaround:
Note: Do not proceed with the workaround if qedentv driver 3.70.35.0 is installed.
In qedentv driver 3.70.7.0, the module parameter en_periodic_stats can be set to disable periodic stats. Reboot the host after setting the parameter to avoid this issue.
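The workaround could be applied from the ESXi shell roughly as follows, on a host where ESXCLI is still responding. This is a sketch, not a verified procedure: the value 0 for en_periodic_stats is an assumption, since this article does not state the value that disables periodic stats. Check the parameter description in the list output, or confirm with the hardware vendor, before setting it.

```shell
# Show the loaded qedentv module details, including the driver version.
vmkload_mod -s qedentv | grep -i version

# List the qedentv module parameters, their current values and descriptions.
esxcli system module parameters list -m qedentv

# ASSUMPTION: 0 disables periodic stats; the article does not give the value.
# -a appends to the existing parameter string instead of replacing it.
esxcli system module parameters set -m qedentv -a -p "en_periodic_stats=0"

# Reboot the host afterwards, as the workaround requires.
```

Without -a, the set command replaces the whole parameter string for the module, which would drop any qedentv parameters already configured.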