Hostd stops responding when SR-IOV enabled NICs fail to initialize due to interrupt vector exhaustion


Article ID: 334712


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

When you enable Single Root I/O Virtualization (SR-IOV) on Network Interface Controllers (NICs) and reboot the ESXi host, the hostd service might stop responding and the host might become unreachable through vCenter Server. Commands that depend on hostd (such as esxcli) might also fail with an error message similar to the following:

Connect to localhost failed

When SR-IOV fails to initialize on a NIC, messages similar to the following are logged in vmkernel.log:

YYYY-MM-DDT21:58:00.706Z cpu180:nnnnnnnnnn)WARNING: VMK_PCI: 705: device SSSS:BB:DD.F failed to allocate 3 MSIX interrupts
YYYY-MM-DDT21:58:00.720Z cpu180:nnnnnnnnnn)WARNING: LinPCI: LinuxPCI_EnableMSIX:883: SSSS:BB:DD.F: Interrupt allocation failed with Out of resources ...
YYYY-MM-DDT21:58:00.745Z cpu180:nnnnnnnnnn)<3>ixgbe SSSS:BB:DD.F: (unregistered net_device): MSI-X interrupt not available - disabling SR-IOV ...
YYYY-MM-DDT21:58:00.768Z cpu180:nnnnnnnnnn)PCI: 534: Disabling SR-IOV Mode for PF=SSSS:BB:DD.F
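
To check whether a host has hit this condition, the log can be scanned for the two most telling messages shown above. The following is a minimal Python sketch, not a supported tool: the regular expressions are modelled only on the sample lines in this article, the wording may differ between drivers and ESXi builds, and the function names and the default path /var/log/vmkernel.log are assumptions used for illustration.

```python
import re
from collections import defaultdict

# Patterns modelled on the sample vmkernel.log lines in this article.
# Assumption: other drivers or builds may word these messages differently.
MSIX_FAIL = re.compile(r"device (\S+) failed to allocate (\d+) MSIX interrupts")
SRIOV_OFF = re.compile(r"Disabling SR-IOV Mode for PF=(\S+)")


def find_sriov_failures(lines):
    """Map each PCI address to what the log says about it.

    Each value is a dict with:
      msix_requested: number of MSI-X interrupts that could not be
                      allocated, or None if no such line was seen
      sriov_disabled: True if SR-IOV was disabled for that physical function
    """
    result = defaultdict(lambda: {"msix_requested": None, "sriov_disabled": False})
    for line in lines:
        match = MSIX_FAIL.search(line)
        if match:
            result[match.group(1)]["msix_requested"] = int(match.group(2))
            continue
        match = SRIOV_OFF.search(line)
        if match:
            result[match.group(1)]["sriov_disabled"] = True
    return dict(result)


def scan_file(path="/var/log/vmkernel.log"):
    """Hypothetical helper: run the scan over a log file on disk."""
    with open(path, errors="replace") as handle:
        return find_sriov_failures(handle)
```

Calling scan_file() on a copy of the log returns one entry per affected device address, which shows how many physical functions failed to obtain MSI-X interrupts.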

This issue occurs when ESXi runs out of MSI-X interrupt vectors while initializing an SR-IOV-enabled NIC.


Environment

VMware vSphere ESXi 5.5

Resolution

To work around this issue, enable SR-IOV on as few devices as possible.
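
As an illustration of limiting SR-IOV to a subset of ports, the sketch below builds an esxcli command string for a driver module parameter. It rests on assumptions this article does not state: that the NIC uses the ixgbe driver seen in the log excerpt, that the driver accepts a max_vfs parameter with one comma-separated value per port, and that a value of 0 leaves SR-IOV disabled on that port. The function name is hypothetical. Verify the parameter against the driver documentation before applying it.

```python
def build_max_vfs_command(vfs_per_port, module="ixgbe"):
    """Return an esxcli command string that sets max_vfs for a driver module.

    vfs_per_port: virtual function count per port, in port order.
                  0 is assumed to leave SR-IOV disabled on that port.
    """
    if not vfs_per_port:
        raise ValueError("at least one port value is required")
    if any(count < 0 for count in vfs_per_port):
        raise ValueError("virtual function counts must not be negative")
    value = ",".join(str(count) for count in vfs_per_port)
    return f'esxcli system module parameters set -m {module} -p "max_vfs={value}"'


# Example: enable 8 virtual functions on the first port only.
print(build_max_vfs_command([8, 0, 0, 0]))
```

Because esxcli depends on hostd, a command like this would have to be run while hostd is responding, and a module parameter change of this kind normally takes effect only after the host is rebooted.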