ESXi host crash due to frequent PSOD (purple screen of death) with different backtrace for each crash

Article ID: 378571

Updated On:

Products

VMware vSphere ESXi 8.0

Issue/Introduction

  • The ESXi host crashes frequently with a PSOD, and no single function appears consistently in the backtraces; each crash may show a different backtrace.
  • The following are some of the backtraces observed on the same ESXi host:

    PSOD - 1

    VMware ESXi 8.0.2 [Releasebuild-23825572 x86_64]
    #PF Exception 14 in world <WorldID>:hostd-worker IP 0x42000741d1ff addr 0x420000000000
    PTEs:0x20bcc2c023:0x20a6537023:0x0;

    Module(s) involved in panic: [vmkernel Version Releasebuild-23825572]
    *PCPU30:<WorldID>/hostd-worker
    PCPU 0: UUUUUISSSUUISSIIIISSSUSUSUUUUUUSSSSS
    [0x42000741d1ff]DeviceDupID@vmkernel#nover+0x53
    [0x42000741e875]Device_GetDeviceID@vmkernel#nover+0x4a
    [0x4200077f4b9e]PCI_GetPCIDeviceFromVMKDevice@vmkernel#nover+0x2b
    [0x4200074bde28]vmk_PCIGetDevice@vmkernel#nover+0x15
    [0x4200079119ce]PsaStorVsiAdapterInfoGet@vmkernel#nover+0x40f
    [0x420007401f49]VSI_GetInfo@vmkernel#nover+0x192
    [0x4200079f0a0b]UW64VMKSyscallUnpackVSI_Get@vmkernel#nover+0x168
    [0x4200079841c8]User_UWVMK64SyscallHandler@vmkernel#nover+0x129
    [0x420007564c54]SyscallUWVMK64@vmkernel#nover+0x90

    PSOD - 2

    VMware ESXi 8.0.2 [Releasebuild-23825572 x86_64]

    #PF Exception 14 in world <WorldID>:HELPER_MISC_ IP 0x4200382516fa addr 0x3ea228
    PTEs:0x20ab3a8023:0x14426f023:0x0;

    Module(s) involved in panic: [vmkernel Version Releasebuild-23825572]
    *PCPU30:<WorldID>/HELPER_MISC_INFRA_QUEUE
    PCPU 0: SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
    [0x4200382516fa]dict_repeat@vmkernel#nover+0x42
    [0x420038251d0f]lzma_main@vmkernel#nover+0x5d4
    [0x420038252735]xz_dec_lzma2_run@vmkernel#nover+0x6be
    [0x420038252e06]xz_dec_run@vmkernel#nover+0x28f
    [0x420037c17af2]Decompress_XZSingle@vmkernel#nover+0x8b
    [0x420037c1940b]BootModuleDecompressModule@vmkernel#nover+0x1b0
    [0x420037cf303c]HelperQueueFunc@vmkernel#nover+0x19d
    [0x42003802c149]CpuSched_StartWorld@vmkernel#nover+0xe2
    [0x420037cdbe7f]Debug_IsInitialized@vmkernel#nover+0xc

    PSOD - 3

    VMware ESXi 8.0.2 [Releasebuild-23825572 x86_64]
    #PF Exception 14 in world <WorldID>:storageRM IP 0x42002c965fc6 addr 0xffffffffffffff95
    PTEs:0x0;

    Module(s) involved in panic: [vmkernel Version Releasebuild-23825572]
    *PCPU30:<WorldID>/storageRM
    PCPU 0: SSUSUUIISIIIISSSUSSSUUIIUUIIUIUSIUSI
    [0x42002c965fc6]StorageLookupPathByName@vmkernel#nover+0x8a
    [0x42002c8e51e4]StorageDevicePathsIteratorNext@vmkernel#nover+0x105
    [0x42002c8aa2a5]PsaStorVsiPerDevicePathList@vmkernel#nover+0x86
    [0x42002c40297b]VSI_GetListInfo@vmkernel#nover+0x1d4
    [0x42002c9f030f]UW64VMKSyscallUnpackVSI_GetList@vmkernel#nover+0x214
    [0x42002c9841c8]User_UWVMK64SyscallHandler@vmkernel#nover+0x129
    [0x42002c564c54]SyscallUWVMK64@vmkernel#nover+0x90
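
When several PSODs have been collected from one host, comparing them by hand can be tedious. The following Python sketch shows one possible way to do it. It is not a VMware tool: the names `summarize_psod` and `compare_psods` are hypothetical, and the regular expressions assume the screen text looks like the excerpts above (a `*PCPU<n>:<world ID>/<world name>` line followed by `[address]function@module#...` frames). It reports the function at the top of each backtrace, any function present in every backtrace, and how often each physical CPU was the faulting one. In the excerpts above, every crash is reported on PCPU30, which may be a useful detail to pass to the hardware vendor.

```python
import re
from collections import Counter

# Faulting-CPU line, e.g. "*PCPU30:<WorldID>/storageRM" (assumed format).
PCPU_RE = re.compile(r"^\s*\*PCPU\s*(\d+):\s*\S*?/(\S+)", re.MULTILINE)
# Backtrace frame, e.g. "[0x42000741d1ff]DeviceDupID@vmkernel#nover+0x53".
FRAME_RE = re.compile(r"^\s*\[0x[0-9a-fA-F]+\](\w+)@(\w+)#", re.MULTILINE)


def summarize_psod(text):
    """Extract the faulting PCPU, world name, and function names from one PSOD."""
    pcpu = PCPU_RE.search(text)
    frames = [m.group(1) for m in FRAME_RE.finditer(text)]
    return {
        "pcpu": int(pcpu.group(1)) if pcpu else None,
        "world": pcpu.group(2) if pcpu else None,
        "frames": frames,  # top of the stack first, as printed on screen
    }


def compare_psods(texts):
    """Compare several PSODs: top frames, shared functions, faulting PCPUs."""
    summaries = [summarize_psod(t) for t in texts]
    # Functions that appear in every backtrace (empty if the crashes are unrelated).
    common = set(summaries[0]["frames"]) if summaries else set()
    for s in summaries[1:]:
        common &= set(s["frames"])
    return {
        "faulting_functions": [s["frames"][0] if s["frames"] else None
                               for s in summaries],
        "common_functions": sorted(common),
        "pcpu_counts": dict(Counter(s["pcpu"] for s in summaries)),
    }


# Abbreviated excerpts from PSOD - 1 and PSOD - 2 above.
PSOD_1 = """
*PCPU30:<WorldID>/hostd-worker
[0x42000741d1ff]DeviceDupID@vmkernel#nover+0x53
[0x4200079841c8]User_UWVMK64SyscallHandler@vmkernel#nover+0x129
"""
PSOD_2 = """
*PCPU30:<WorldID>/HELPER_MISC_INFRA_QUEUE
[0x4200382516fa]dict_repeat@vmkernel#nover+0x42
[0x420038251d0f]lzma_main@vmkernel#nover+0x5d4
"""

if __name__ == "__main__":
    print(compare_psods([PSOD_1, PSOD_2]))
```

An empty `common_functions` list together with a single dominant entry in `pcpu_counts` matches the pattern described in this article, but it is only a hint; the hardware vendor's diagnostics remain the deciding check.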

Resolution

Because the host crashes with a different backtrace for each PSOD, this may be a hardware problem. Contact your hardware vendor to run diagnostics on the host.

In similar cases, replacing the CPU resolved the issue.