ESXi hosts become unresponsive and the vmkernel.log is filled with the message "Failed to open file 'hpilo-d0ccbXX'"

Article ID: 388283

Products

VMware vSphere ESXi 6.0
VMware vSphere ESXi 7.0

Issue/Introduction

This KB explains why the error Failed to open file 'hpilo-d0ccb##' can fill the vmkernel log at a higher-than-normal rate and cause storage disconnections.

  Symptoms:

  • You see the following error repeated in the vmkernel log multiple times per second:
    • YYYY-MM-DDTHH:MM:SS.483Z cpu51:2108940)FSS: 7441: Failed to open file 'hpilo-d0ccb##'; Requested flags 0x5, world: 2108940 [sut], (Existing flags 0x0, world: 0 [unknown]): Busy
      YYYY-MM-DDTHH:MM:SS.496Z cpu51:2108940)ilo : Open could not dequeue a packet

Seeing these messages roughly every 30 seconds is normal. When they repeat multiple times per second, the host can lose its storage connections, and you may then see any of the following:

  • The ESXi host hangs or becomes unresponsive.
  • VMs continue to run in host memory but eventually show as invalid.
  • You may also see the error: 
    • 503 Service Unavailable (Failed to connect to endpoint:.. 

Environment

VMware vSphere ESXi running on HPE servers with HPE Integrated Lights-Out (iLO).

Cause

According to HPE documentation, the HPE WBEM Provider tries to open a channel to HPE Integrated Lights-Out (iLO) every 30 seconds:

"The message ... applies to all apps that use the hpilo driver - not just providers. It is a normal message that will also be seen with HPE Agentless Management Service (AMS), and HPE Smart Update Tool (iSUT) when trying to find an unused hpilo channel. It will be seen in all VMware ESXi versions, e.g, ESXi 6.5, 6.7 and 7.x that run the native versions of the hpilo driver. The message is logged by the operating system in the vmkernel.log when the ilo driver fails to open a command for a channel that is already in use"

Source: https://support.hpe.com/hpesc/public/docDisplay?docId=a00029796en_us&docLocale=en_US 


In this case, however, the logging frequency is higher than expected: around 45 log entries per second (3 per channel across 15 channels).

  • /var/log/vmkernel.log
    YYYY-MM-DDTHH:MM:SS.483Z cpu51:2108940)FSS: 7441: Failed to open file 'hpilo-d0ccb##'; Requested flags 0x5, world: 2108940 [sut], (Existing flags 0x0, world: 0 [unknown]): Busy
    YYYY-MM-DDTHH:MM:SS.496Z cpu51:2108940)ilo : Open could not dequeue a packet

    YYYY-MM-DDTHH:MM:SS.496Z cpu51:2108940)FSS: 7441: Failed to open file 'hpilo-d0ccb##'; Requested flags 0x5, world: 2108940 [sut], (Existing flags 0x0, world: 0 [unknown]): Busy
    YYYY-MM-DDTHH:MM:SS.509Z cpu51:2108940)ilo : Open could not dequeue a packet

    YYYY-MM-DDTHH:MM:SS.509Z cpu51:2108940)FSS: 7441: Failed to open file 'hpilo-d0ccb##'; Requested flags 0x5, world: 2108940 [sut], (Existing flags 0x0, world: 0 [unknown]): Busy
    YYYY-MM-DDTHH:MM:SS.521Z cpu51:2108940)ilo : Open could not dequeue a packet

    • Note: By design, the 15 iLO channels are probed and logged in sequential order, from hpilo-d0ccb01 and hpilo-d0ccb02 through hpilo-d0ccb15. The probe repeats once a cycle through all 15 channels completes.

These messages can generally be ignored unless they repeat multiple times per second. At that rate, the logging clogs I/O, causing storage disconnections and VM latency or failures.
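
To judge whether a host is logging at the normal rate or at the problematic one, it can help to count the failures per second in a copy of the vmkernel log. The following is a minimal Python sketch, not a VMware or HPE tool. It assumes the log lines have the shape quoted in this article (a timestamp ending in Z, followed by the FSS message), and the function names and the threshold of one message per second are illustrative choices rather than documented limits.

```python
import re
from collections import Counter

# Matches the FSS "Failed to open file 'hpilo-..." lines quoted in this article.
# Group 1 captures the timestamp truncated to whole seconds.
HPILO_OPEN_FAILURE = re.compile(
    r"^(\S+?)(?:\.\d+)?Z\s.*Failed to open file 'hpilo-"
)


def count_per_second(lines):
    """Return a Counter mapping each second to its number of hpilo open failures."""
    counts = Counter()
    for line in lines:
        match = HPILO_OPEN_FAILURE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def busiest_seconds(lines, threshold=1):
    """Return (second, count) pairs above the threshold, busiest first.

    The default threshold of 1 is an assumption: it flags any second that
    contains more than one failure message.
    """
    counts = count_per_second(lines)
    return [(sec, n) for sec, n in counts.most_common() if n > threshold]


def report(path, top=10):
    """Print the busiest seconds found in the log file at the given path."""
    with open(path, errors="replace") as log:
        for second, n in busiest_seconds(log)[:top]:
            print(f"{second}  {n} hpilo open failures")
```

For example, calling report("/var/log/vmkernel.log") against a copy of the log lists the seconds with the most failures. An empty result suggests the messages are arriving at the expected background rate, while sustained counts in the tens per second match the pattern described above.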

Resolution

Update the HPE drivers and firmware to the latest compatible versions, then reboot the host.

For more information on this issue, contact HPE hardware support.