ESXi host experiences a Purple Diagnostic Screen referencing __lpfc_sli_get_iocbq
book
Article ID: 317639
calendar_today
Updated On:
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms: On an ESXi host that has Emulex FCoE HBAs installed and HPE Agentless Management Service (AMS) up and running after 50 days of the driver being loaded, you experience these symptoms:
The ESXi host fails with a Purple Diagnostic Screen.
In the vmkernel.log file of the ESXi host, you see these brcmfcoe warnings prior to the PSOD similar to:
2018-12-16T17:09:34.805Z cpu44:68685)WARNING: brcmfcoe: lpfc_sli_issue_iocb_wait:10765: 0:0330 IOCB wake NOT set, Data x24 x0
Note: The preceding log excerpts are only examples. The date, time, and environmental variables may vary depending on your environment.
Cause
The HPE AMS tool periodically sends management commands to the HBA driver. In certain situations, the driver would not receive a response to the management command, and that would result in the driver not cleaning up the command resources used to process the response properly.
This would lead to the driver running out of management command resources. Any further management commands sent to the HBA driver may give a NULL address because of lack of resources and will cause a NULL references hence causing the ESXi to PSOD.
Resolution
This issue is resolved in brcmfcoe 12.0.1278.0 async driver or later versions.
Upgrading to these versions of ESXi upgrades the brcmfcoe driver version to 12.0.1278.0 or later. This is a known issue affecting VMware ESXi 6.0.x. Currently, there is no resolution for this version of ESXi.