PSOD on ESXi host @BlueScreen: #PF Exception 14 in world XXXXXX:ql_fcoe_dela

Article ID: 319993


Products

VMware vSphere ESXi

Issue/Introduction

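An ESXi host that uses the qedf (FCoE) driver may fail with a purple diagnostic screen (PSOD). This article describes the symptoms and how to resolve the issue by updating the driver.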


Symptoms:

The ESXi host may fail with a PSOD showing a backtrace similar to the following:
 

[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)@BlueScreen: #PF Exception 14 in world 2098138:ql_fcoe_dela IP 0x42002b40159c addr 0x128
PTEs:0x14f2fa023;0x14f2fb023;0x14f2fc023;0x0;
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)Code start: 0x42002a400000 VMK uptime: 82:21:43:01.789
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf28:[0x42002b40159c]CommandPumpOnPassiveLevel@(qedf)#<None>+0x0 stack: 0x43127a373000
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf30:[0x42002b3e684a]SendFCoEVlanSolicitation@(qedf)#<None>+0x353 stack: 0x43127a373018
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf50:[0x42002b3e7013]FipVlanTimeoutWork@(qedf)#<None>+0x15c stack: 0x43127a373018
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf70:[0x42002b3ff711]ql_fcoe_do_singlethread_work@(qedf)#<None>+0x76 stack: 0x43127a373000
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf90:[0x42002a51e224]vmkWorldFunc@vmkernel#nover+0x49 stack: 0x42002a51e220
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bfe0:[0x42002a7b3b09]CpuSched_StartWorld@vmkernel#nover+0x86 stack: 0x0
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989c000:[0x42002a4c4d7f]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
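
When comparing a host's PSOD against the backtrace above, it can help to reduce each frame to its function and module names. The following is a minimal Python sketch, assuming the frame layout shown in this article ("[address]Function@(module)#..."); the helper names parse_frames and first_driver_frame are hypothetical and are not part of any VMware tool.

```python
import re

# One backtrace frame looks like:
#   ...[0x42002b40159c]CommandPumpOnPassiveLevel@(qedf)#<None>+0x0 stack: ...
# The pattern is an assumption based on the lines shown in this article.
FRAME_RE = re.compile(r"\[0x[0-9a-f]+\](?P<func>\w+)@(?P<module>\(?[\w.]+\)?)#")


def parse_frames(text):
    """Return a list of (function, module) tuples, top frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Driver modules appear in parentheses, e.g. "(qedf)".
            frames.append((match.group("func"), match.group("module").strip("()")))
    return frames


def first_driver_frame(frames):
    """Return the first frame that is not in the vmkernel itself, or None."""
    for func, module in frames:
        if module != "vmkernel":
            return func, module
    return None


# Three frames copied from the backtrace in this article.
SAMPLE = """\
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf28:[0x42002b40159c]CommandPumpOnPassiveLevel@(qedf)#<None>+0x0 stack: 0x43127a373000
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf30:[0x42002b3e684a]SendFCoEVlanSolicitation@(qedf)#<None>+0x353 stack: 0x43127a373018
[YYYY-MM-DDTHH:MM:SS] cpu25:2098138)0x4538d989bf90:[0x42002a51e224]vmkWorldFunc@vmkernel#nover+0x49 stack: 0x42002a51e220
"""

if __name__ == "__main__":
    for func, module in parse_frames(SAMPLE):
        print(f"{module}: {func}")
```

A trace that matches this article is expected to contain SendFCoEVlanSolicitation in the qedf module near the top of the stack.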

 

 

Environment

VMware vSphere ESXi 7.0

Cause

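The PSOD is caused by a race condition in the qedf driver between SendFCoEVlanSolicitation and LogoutAllFabrics, as described in the driver release notes under Additional Information.

To check whether the host is exposed, run the command "localcli storage core adapter list". The output may show adapters that use the qedf driver.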
 

The vmkernel.log file may show repeated link state changes on the adapter, similar to the following:

 

[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1897:Info: ST(LINK): LINK_UP->LINK_DOWN
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1897:Info: ST(LINK): LINK_UP->LINK_DOWN
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1897:Info: ST(LINK): LINK_UP->LINK_DOWN
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1897:Info: ST(LINK): LINK_UP->LINK_DOWN
[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP
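
To gauge how often the link changes state, the vmkernel.log entries above can be counted per adapter. This is a minimal Python sketch, assuming the qedfc_link_update_handler message format shown in this article; the function name count_link_transitions is hypothetical, and the article does not define how many transitions indicate a problem.

```python
import re
from collections import Counter

# Matches entries such as:
#   ...qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP
# The pattern is an assumption based on the log lines shown in this article.
LINK_RE = re.compile(
    r"qedf:(?P<hba>vmhba\d+):qedfc_link_update_handler:\d+:Info: "
    r"ST\(LINK\): (?P<old>\w+)->(?P<new>\w+)"
)


def count_link_transitions(lines):
    """Count link transitions, keyed by (adapter, new link state)."""
    counts = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            counts[(match.group("hba"), match.group("new"))] += 1
    return counts


# Three entries copied from the log excerpt in this article.
SAMPLE_LINES = [
    "[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP",
    "[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1897:Info: ST(LINK): LINK_UP->LINK_DOWN",
    "[YYYY-MM-DDTHH:MM:SS] cpu26:2097871)qedf:vmhba0:qedfc_link_update_handler:1926:Info: ST(LINK): LINK_DOWN->LINK_UP",
]

if __name__ == "__main__":
    for (hba, state), count in sorted(count_link_transitions(SAMPLE_LINES).items()):
        print(f"{hba} -> {state}: {count}")
```

To scan a real log, pass an open file object instead of SAMPLE_LINES; lines that do not match the pattern are ignored.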

Resolution

Update the qedf driver to version 2.74.1.0-1OEM.
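
Whether a host still needs the update can be checked by comparing its installed qedf driver version with the fixed version. The sketch below is a minimal Python example under stated assumptions: the installed version string is obtained separately (for example, from the output of "esxcli software vib list"), the numeric part precedes the first hyphen, and the names version_tuple and needs_update are hypothetical.

```python
# Fixed driver version named in this article (numeric part only).
FIXED_VERSION = "2.74.1.0"


def version_tuple(version):
    """Convert '2.74.1.0-1OEM...' to (2, 74, 1, 0) for numeric comparison.

    Assumes the dotted numeric version precedes the first hyphen.
    """
    base = version.split("-", 1)[0]
    return tuple(int(part) for part in base.split("."))


def needs_update(installed, fixed=FIXED_VERSION):
    """Return True if the installed driver is older than the fixed version."""
    return version_tuple(installed) < version_tuple(fixed)


if __name__ == "__main__":
    # "2.12.9.0-1OEM" is a made-up older version used only for illustration.
    for installed in ("2.12.9.0-1OEM", "2.74.1.0-1OEM"):
        status = "update required" if needs_update(installed) else "fixed version or newer"
        print(f"{installed}: {status}")
```

Comparing tuples of integers avoids the pitfall of string comparison, where "2.9" would sort after "2.74".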

Additional Information

Driver for 45000/41000 Series Adapters:

https://customerconnect.vmware.com/downloads/details?downloadGroup=DT-ESXI70-MARVELL-E4-CNA-DRIVER-BUNDLE-503820&productId=974

 

 

The release notes for this driver confirm that the issue is fixed:

QLogic qedf VMware ESX Native Driver for ESXi 7.0/8.0
Copyright (c) 2015-2019 Cavium Inc.
Copyright (c) 2019-2020 Marvell Semiconductor, Inc.
All rights reserved
Version: 2.74.1.0
===========================
Enhancements:
-------------
- Update to qed-8.74.0.0 with storm fw 8.72.1.0
Fixes:
------
    * [FJT-9121]  : PSOD due to race condition between SendFCoEVlanSolicitation and
                    LogoutAllFabrics.
      Resolution  : Add mechanism of sync between SendFCoEVlanSolicitation and
                    LogoutAllFabrics.
      Scope       : 45000/41000 Series Adapters



 

 


Impact/Risks:

The ESXi host fails with a PSOD and stops responding until it is rebooted.