PSOD Memory Trace:
YYYY-MM-DDTHH:MM:SS.181Z cpu74:2097226)1 other PCPU is in panic.
YYYY-MM-DDTHH:MM:SS.072Z cpu74:2097226)NMI: 738: NMI IPI: PC 0x42003233526e, SP 0x453b0251b8e8 (Src 0x1, CPU74)
YYYY-MM-DDTHH:MM:SS.070Z cpu74:2097226)NMI: 738: NMI IPI: PC 0x42003116ec32, SP 0x453b0251bac8 (Src 0x1, CPU74)
YYYY-MM-DDTHH:MM:SS.872Z cpu204:2098892)Jumpstart plugin restore-nfs-volumes activation failed.
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)Backtrace for current CPU #74, worldID=2097226, fp=0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x452ac04a2d30:[0x42003117be80]PanicvPanicInt@vmkernel#nover+0x20c stack: 0x3734, 0x42003117be80, 0x0, 0x420000000001, 0x42003117be80
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x452ac04a2de0:[0x42003117c656]Panic_WithBacktrace@vmkernel#nover+0x57 stack: 0x452ac04a2e50, 0x452ac04a2e00, 0x453b0251f000, 0x452ac04a2eaf, 0x42003233526e
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x452ac04a2e50:[0x420031178561]NMI_Interrupt@vmkernel#nover+0x516 stack: 0xf6e5c8b2faa5c8b6, 0xcdd55a12c1955a16, 0xcedded30c29ded34, 0xf5ed7f90f9ad7f94, 0x4840c39f4400c39b
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x452ac04a2f10:[0x4200316a6404]IDTNMIWork@vmkernel#nover+0x95 stack: 0x0, 0x4200316a786d, 0x0, 0x4200316a10c7, 0x750
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x452ac04a2f30:[0x4200316a786c]Int2_NMI@vmkernel#nover+0x9 stack: 0x750, 0x750, 0x0, 0x8, 0x24
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x452ac04a2f40:[0x4200316a10c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0, 0x8, 0x24, 0x453b0251b960, 0x33
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251b8e8:[0x42003233526e]ehci_filter@(vmkusb)#<None>+0x26 stack: 0x33, 0x4303c6f04eb0, 0x420031161591, 0x0, 0x4303c6e03a00
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251b8f0:[0x4200322b093a][email protected]#1+0xf stack: 0x4303c6f04eb0, 0x420031161591, 0x0, 0x4303c6e03a00, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251b910:[0x420031161590]IntrCookie_DoInterrupt@vmkernel#nover+0x5a1 stack: 0x0, 0x1980, 0x453b0251ba00, 0x100001980, 0x33
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251b9c0:[0x420031161693]IntrCookie_VmkernelInterrupt@vmkernel#nover+0x38 stack: 0x4d, 0x4200316a890b, 0x0, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251b9e0:[0x4200316a890a]IDT_IntrHandler@vmkernel#nover+0x97 stack: 0x0, 0x4200316a10c7, 0x750, 0x750, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251ba00:[0x4200316a10c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0, 0x0, 0x0, 0x0, 0x43316537fcf0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bac8:[0x42003116ec32]MCSUnlockWork@vmkernel#nover+0x2e stack: 0x42003216c0c6, 0x453b0251baf0, 0x433164e01480, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bad0:[0x420032164e25]nmlx_Complete@(nmlx5_core)#<None>+0x1a stack: 0x453b0251baf0, 0x433164e01480, 0x0, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bae0:[0x42003216c0c5]nmlx5_CompleteEnt@(nmlx5_core)#<None>+0x13e stack: 0x0, 0x0, 0x0, 0x1, 0x43316537fcc0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bb20:[0x42003216ca82]nmlx5_CmdCompHandler@(nmlx5_core)#<None>+0x127 stack: 0x452318bba880, 0x433164e01480, 0x420032214c40, 0x4200321d854f, 0x80
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bb60:[0x420032170195]nmlx5_MSIxISR@(nmlx5_core)#<None>+0x1fa stack: 0x453b0251bb88, 0x0, 0x453b0251bb90, 0x4303c6f0f390, 0x18b
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bbc0:[0x42003115fa3b]IntrCookieBH@vmkernel#nover+0x170 stack: 0x4303c6f04ea0, 0x1, 0x4303c6f04ea0, 0x4303c6f0f320, 0x3a
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bc50:[0x42003113f98c]BH_DrainAndDisableInterrupts@vmkernel#nover+0x159 stack: 0x420052801570, 0x8d2639427aacd, 0x0, 0x100000000, 0x420052801040
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bcc0:[0x4200311616ff]IntrCookie_VmkernelInterrupt@vmkernel#nover+0xa4 stack: 0x4d, 0x4200316a890b, 0x0, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bce0:[0x4200316a890a]IDT_IntrHandler@vmkernel#nover+0x97 stack: 0x0, 0x4200316a10c7, 0x750, 0x750, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bd00:[0x4200316a10c6]gate_entry@vmkernel#nover+0xa7 stack: 0x0, 0x0, 0x8898, 0x414, 0x4303c74cd260
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bdc8:[0x420031090ea7]Power_ArchPerformWait@vmkernel#nover+0x157 stack: 0x420052801880, 0x800000000, 0x100000414, 0x420052800000, 0x420052800000
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bdd0:[0x420031090f75]Power_ArchSetCState@vmkernel#nover+0xba stack: 0x800000000, 0x100000414, 0x420052800000, 0x420052800000, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251be20:[0x4200316da1ed]CpuSchedIdleLoopInt@vmkernel#nover+0x292 stack: 0x0, 0x7fffffffffffffff, 0x1, 0x7fffffffffffffff, 0xfffffffffffffff6
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251be90:[0x4200316dedc5]CpuSched_IdleLoop@vmkernel#nover+0x12 stack: 0x4a, 0x4200310804c8, 0x0, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251bea0:[0x42003115efaa]Init_APIdle@vmkernel#nover+0x3f stack: 0x0, 0x0, 0x0, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)0x453b0251beb0:[0x4200310804c7]SMPAPIdle@vmkernel#nover+0x27c stack: 0x0, 0x0, 0x0, 0x0, 0x0
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)CPU model name: AMD EPYC 9554 64-Core Processor , FMS: 19/11/1, uCodeRev: XXXXXXXXX
YYYY-MM-DDTHH:MM:SS.182Z cpu74:2097226)PRODUCTNAME:ProLiant DL385 Gen11, VENDORNAME:HPE, SERIAL_NUMBER:XXXXXXXXXXX, SERVER_UUID:3XXXXXXX0-3XX1-5XXX-32XX-3XXXXXXXXXXXX, VERSION:, SKU:XXXXXXXX1, FAMILY:ProLiant
ESXi 8.x
The memory dump shows symptoms of an interrupt storm from iLO on the HPE ProLiant DL385 Gen11 server, as described in the following HPE KB:
https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=a00143662en_us
We recommend updating the iLO driver to version 10.9.1 or later to address the issue.
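
As a rough aid when reading a backtrace like the one above, the following Python sketch counts how many frames belong to each module, which makes it easier to see which drivers were running in interrupt context at the time of the panic. The function name modules_in_backtrace and the regular expression are illustrative assumptions based only on the frame format visible in this trace. This is not a VMware or HPE tool, and frames that do not fit the pattern (for example, frames whose symbol was garbled) are skipped.

```python
import re
from collections import Counter

# Assumed frame format, taken from the trace in this note:
#   ...)0x453b0251b8e8:[0x42003233526e]ehci_filter@(vmkusb)#<None>+0x26 stack: ...
#   ...)0x452ac04a2d30:[0x42003117be80]PanicvPanicInt@vmkernel#nover+0x20c stack: ...
FRAME_RE = re.compile(
    r"\)0x[0-9a-f]+:\[0x[0-9a-f]+\]"  # frame address, then return address in brackets
    r"(?P<symbol>[^@\s]+)@"           # function name up to the '@'
    r"\(?(?P<module>[^)#\s]+)\)?#"    # module name, optionally wrapped in parentheses
)


def modules_in_backtrace(lines):
    """Return a Counter mapping module name to number of backtrace frames."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.search(line)
        if match:  # lines that are not recognizable frames are ignored
            counts[match.group("module")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_modules.py < psod_trace.txt
    for module, frames in modules_in_backtrace(sys.stdin).most_common():
        print(f"{module}: {frames}")
```

Modules other than vmkernel in the output are the driver modules present on the stack. The count alone does not prove which device caused the interrupt storm, so it should be read together with the vendor KB.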