ESXi 7.0.3 with QLogic qedentv driver 3.70.7.0 may intermittently become unresponsive because the hostd service enters a non-responsive state
Article ID: 317890
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms:
The hostd service is detected as non-responsive.
Hosts lose their connection to vCenter.
ESXCLI commands fail with a connection failed error.
Restarting the management agents does not resolve the issue.
vmkernel.log shows that qedentv enters an infinite wait in the RXQ destroy context on a vmnic. The driver prints the same message for hours, one line per second:
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
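To check a collected vmkernel.log for this signature, the repeated message can be counted per vmnic and fastpath index. The following is a minimal sketch, not a supported tool: the function names, the threshold of 60 lines, and the idea of running it against a log copied off the host are assumptions made for illustration. The pattern is taken from the sample lines above.

```python
import re
from collections import Counter

# Matches the message shown in the symptoms, for example:
# ...[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
PATTERN = re.compile(
    r"\[qedentv_free_mem_fp:\d+\((?P<nic>vmnic\d+)\)\]"
    r"Waiting for stats \(fp:(?P<fp>\d+)\) to complete"
)


def count_stats_waits(lines):
    """Count matching log lines per (vmnic, fastpath index) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("nic"), int(match.group("fp")))] += 1
    return counts


def looks_stuck(counts, threshold=60):
    """Return the (vmnic, fp) pairs seen at least `threshold` times.

    The default threshold is an arbitrary assumption: at one line per
    second, 60 repeats corresponds to roughly a minute of waiting.
    """
    return sorted(key for key, seen in counts.items() if seen >= threshold)


def scan_file(path, threshold=60):
    """Scan a log file (hypothetical helper) and return suspect pairs."""
    with open(path, errors="replace") as handle:
        return looks_stuck(count_stats_waits(handle), threshold)
```

Calling `scan_file` with the path to a copy of vmkernel.log returns a list of `(vmnic, fp)` pairs that repeat often enough to suggest the wait is not completing. An empty list means the signature was not found at the chosen threshold.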
Environment
VMware vSphere 7.0.x
Cause
The root cause is still unknown.
Resolution
The infinite loop that occurs while waiting for stats completion in fp mem free is resolved in qedentv driver 3.70.35.0. The newer driver adds a timeout for stats and netpoll completion in fp mem free.
Workaround:
Note: Do not proceed with the workaround if qedentv driver 3.70.35.0 is installed.
In qedentv driver 3.70.7.0, the module parameter en_periodic_stats can be set to disable periodic stats. Set the parameter and then reboot the host to avoid this issue.
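Whether the workaround applies depends on the installed driver version. The sketch below compares a version string against the fixed release named in the Resolution. It rests on some assumptions: the function names are hypothetical, the version string is expected to be dotted integers optionally followed by a build suffix after a hyphen, and versions later than 3.70.35.0 are treated the same as 3.70.35.0, which this article does not state explicitly.

```python
FIXED_VERSION = "3.70.35.0"  # driver version named in the Resolution


def parse_version(text):
    """Turn '3.70.7.0' or '3.70.7.0-<build suffix>' into a tuple of ints.

    Dropping everything after the first hyphen is an assumption about
    how the installed driver version is reported.
    """
    dotted = text.strip().split("-", 1)[0]
    return tuple(int(part) for part in dotted.split("."))


def workaround_applies(installed):
    """True when the installed driver predates the fixed version."""
    return parse_version(installed) < parse_version(FIXED_VERSION)
```

For example, `workaround_applies("3.70.7.0")` evaluates to `True`, while `workaround_applies("3.70.35.0")` evaluates to `False`. Tuples of integers are compared rather than raw strings because a plain string comparison would order "3.70.7.0" after "3.70.35.0".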