Symptoms:
- The ESXi host experiences many connection resets and All Paths Down (APD) or similar path-down conditions.
- An ESXi 6.5, ESXi 6.7, or ESXi 7.0 host experiences a PSOD with references to the FCoE module (qfle3f) in the backtrace.
PSOD: Panic bora/vmkernel/main/dlmalloc.c:4908 - Corruption in DLMALLOC referencing details ql_fcoe_delayed_wq.
- You see a backtrace similar to:
0x451b9fd9bd50:[0x418037d0ba15]PanicvPanicInt@vmkernel#nover+0x439 stack: 0x4302d004c490, 0x4180380a7558, 0x451b9fd9bdf8, 0x0, 0x100000001
0x451b9fd9bdf0:[0x418037d0bc48]Panic_NoSave@vmkernel#nover+0x4d stack: 0x451b9fd9be50, 0x451b9fd9be10, 0x43120f780c20, 0x4180380a7539, 0x132c
0x451b9fd9be50:[0x418037d54363]DLM_free@vmkernel#nover+0x6a8 stack: 0x43120f78acc0, 0x418037d51501, 0x5beea699da51a, 0x418037d15653, 0x0
0x451b9fd9be70:[0x418037d51500]Heap_Free@vmkernel#nover+0x115 stack: 0x0, 0x43120f78acc0, 0x2f, 0x40000000, 0x0
0x451b9fd9bec0:[0x418037c3d987]vmk_SpinlockDestroy@vmkernel#nover+0x48 stack: 0x43120f5df000, 0x418038ab09ed, 0x0, 0x418038abcb52, 0x43120f5df000
0x451b9fd9bee0:[0x418038ab09ec]DeleteFabric@(qfle3f)#<None>+0x29 stack: 0x43120f5df000, 0x43120f5df200, 0x0, 0x418038ab2c00, 0x43120f5f3610
0x451b9fd9bf40:[0x418038ab0bd9]_ReleaseFabricReference@(qfle3f)#<None>+0x2e stack: 0x43120f786000, 0x43120f786018, 0x1, 0x418038abc27b, 0x418038abc1f8
0x451b9fd9bf70:[0x418038abc27a]ql_fcoe_do_singlethread_work@(qfle3f)#<None>+0x83 stack: 0x2f, 0x418037d2902f, 0x2f, 0x418038abc1f8, 0x418037d2902a
0x451b9fd9bf90:[0x418037d2902e]vmkWorldFunc@vmkernel#nover+0x4f stack: 0x418037d2902a, 0x0, 0x451b8a6a3100, 0x451b9fda3000, 0x451b8a6a3100
0x451b9fd9bfe0:[0x418037f0e322]CpuSched_StartWorld@vmkernel#nover+0x77 stack: 0x0, 0x0, 0x0, 0x0, 0x0
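
To check whether a captured backtrace matches this symptom, the frames can be scanned for the qfle3f module together with the DLM_free call. The following is only a minimal sketch, not a supported tool: the names FRAME_RE, parse_frames, and matches_symptom are hypothetical, the frame layout is inferred solely from the sample backtrace above, and the sample lines are shortened by omitting the stack words.

```python
import re

# Frame layout assumed from the sample above:
#   <stack addr>:[<return addr>]<symbol>@<module>#<version>+<offset> stack: ...
# Driver modules appear in parentheses, e.g. "@(qfle3f)#<None>".
FRAME_RE = re.compile(
    r"^0x[0-9a-f]+:\[0x[0-9a-f]+\]"
    r"(?P<symbol>[^@\s]+)@\(?(?P<module>[^)#\s]+)\)?#"
)

def parse_frames(backtrace):
    """Return (symbol, module) pairs for every line that looks like a frame."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    return frames

def matches_symptom(backtrace, module="qfle3f"):
    """True if the backtrace has a DLM_free frame and a frame from `module`."""
    frames = parse_frames(backtrace)
    has_dlm_free = any(symbol == "DLM_free" for symbol, _ in frames)
    has_module = any(mod == module for _, mod in frames)
    return has_dlm_free and has_module

# Two frames copied from the backtrace above (stack words omitted).
SAMPLE = """\
0x451b9fd9be50:[0x418037d54363]DLM_free@vmkernel#nover+0x6a8
0x451b9fd9bee0:[0x418038ab09ec]DeleteFabric@(qfle3f)#<None>+0x29
"""

if __name__ == "__main__":
    print(parse_frames(SAMPLE))
    print(matches_symptom(SAMPLE))
```

A match only indicates that the backtrace resembles the one shown here; addresses and offsets vary between builds, so the sketch compares symbol and module names only.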