ESXi 7.0.3 with Qlogic qedentv driver 3.70.7.0 may intermittently go unresponsive due to hostd service entering non-responsive state

Article ID: 317890

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • The hostd service is detected as non-responsive.
  • The host loses its connection to vCenter.
  • ESXCLI commands fail with a "connection failed" error.
  • Restarting the management agents does not resolve the issue.

vmkernel.log shows that qedentv enters an infinite wait in the RXQ destroy context on a vmnic.
The driver prints the same message for hours, one line per second:

[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.

[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.

[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.

[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.

[YYYY-MM-DDTHH:MM:SS] cpu40:2097436)[qedentv_free_mem_fp:2858(vmnic0)]Waiting for stats (fp:25) to complete.
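
One way to confirm this signature is to count the matching lines in the log. The following is a minimal sketch, assuming the log is at /var/log/vmkernel.log and that the message text matches the excerpt above. The helper name count_qedentv_stats_waits is hypothetical and not part of ESXi.

```shell
# Count "Waiting for stats" messages logged by qedentv_free_mem_fp.
# count_qedentv_stats_waits is a hypothetical helper name.
count_qedentv_stats_waits() {
    # Default path is an assumption; pass another file as the first argument.
    logfile="${1:-/var/log/vmkernel.log}"
    # grep -c prints the number of matching lines.
    grep -c 'qedentv_free_mem_fp.*Waiting for stats' "$logfile"
}

# Example usage on the host:
#   count_qedentv_stats_waits /var/log/vmkernel.log
```

A count that keeps growing by roughly one per second is consistent with the behavior described above.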


Environment

VMware vSphere 7.0.x

Cause

The root cause is still unknown.

Resolution

The infinite loop while waiting for stats completion in the fp memory-free path (qedentv_free_mem_fp) is resolved in qedentv driver 3.70.35.0. The newer driver adds a timeout for stats and netpoll completion in that path.
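
To decide whether a host already runs the fixed driver, the installed version can be compared against 3.70.35.0. The sketch below is a hypothetical helper (version_ge), assuming the version is passed as purely numeric, dot-separated fields such as 3.70.7.0, with any build suffix removed first. The installed version itself would come from the host, for example from `esxcli software vib list`; the exact VIB name and output layout are not covered by this article.

```shell
# version_ge A B: succeed (exit 0) if dotted numeric version A >= B.
# Hypothetical helper; assumes numeric, dot-separated fields only.
version_ge() {
    a="$1"; b="$2"
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; a missing field counts as 0.
        fa="${a%%.*}"; fb="${b%%.*}"
        [ -n "$fa" ] || fa=0
        [ -n "$fb" ] || fb=0
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
        # Drop the field that was just compared.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0  # all fields are equal
}

# Example usage:
#   if version_ge "3.70.7.0" "3.70.35.0"; then
#       echo "fixed driver present"
#   else
#       echo "driver predates the fix"
#   fi
```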


Workaround:
Note: Do not proceed with the workaround if qedentv driver 3.70.35.0 is installed.

qedentv driver 3.70.7.0 provides a module parameter, en_periodic_stats, that controls periodic stats. To avoid this issue, set the parameter to 0 to disable periodic stats, and then reboot the host.

To set the module parameter:
  1. SSH to the affected ESXi host
  2. Run the following command:
esxcfg-module -s 'en_periodic_stats=0' qedentv
  3. Reboot the host
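
Before rebooting, it may be worth confirming that the option string was stored. The sketch below wraps the command from the steps above and then prints the module's configured options with `esxcfg-module -g`. The wrapper name apply_qedentv_workaround is hypothetical, and the output format of `esxcfg-module -g` is not shown here.

```shell
# apply_qedentv_workaround is a hypothetical wrapper name, not an ESXi tool.
apply_qedentv_workaround() {
    # Store the option string for the qedentv module.
    esxcfg-module -s 'en_periodic_stats=0' qedentv || return 1
    # Print the stored option string so it can be checked before the reboot.
    esxcfg-module -g qedentv
}

# Example usage on the host (run once, then reboot):
#   apply_qedentv_workaround
```

Note that `esxcfg-module -s` sets the module's whole option string, so any other qedentv options already configured would need to be included in the same string.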