Error: "Failed to ack TLB invalidate" on ARM hardware

Article ID: 404393

Products

VMware vSphere ESX 8.x

Issue/Introduction

  • An ESX on ARM 8 server logs the following errors in /var/run/log/vmkernel.log:

    2025-07-08T07:49:46.150Z cpu0:2097819)ApeiPageRetire: 449: Error Type: 0 CACHE_ERROR
    2025-07-08T07:49:46.150Z cpu0:2097819)ApeiPageRetire: 461: Physical Fault Address: 0x8000400105349440
    2025-07-08T07:49:46.150Z cpu0:2097819)ApeiPageRetire: 497: Vendor Specific Error Info 48
    ----------------------------------------------------------------------------------------------------------------------------------
    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 415: MPIDR : 0x181120000

    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 427: MIDR :  0x413fd0c1
    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 432: Arm Error Info #0
    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 433: Num errors: 1
    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 449: Error Type: 0 CACHE_ERROR
    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 461: Physical Fault Address: 0x8000400105349440
    2025-07-08T07:49:52.757Z cpu0:2097819)ApeiPageRetire: 497: Vendor Specific Error Info 48

  • The ESX host may also crash with the following backtrace:

    2025-07-01T21:49:25.754Z cpu83:2097318)WARNING: Heartbeat: 961: PCPU 0 didn't have a heartbeat for 6 seconds, timeout is 10, 1 IPIs sent; *may* be locked up.
    2025-07-01T21:49:25.754Z cpu83:2097318)Heartbeat: 1014: Sending timer IPI to PCPU 0
    2025-07-01T21:49:25.884Z cpu8:2098251)Backtrace for current CPU #8, worldID=2098251, fp=0x4579a259ba20
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259ba20:[0x420013f9b0e4]PanicvPanicInt@vmkernel#nover+0x114 stack: 0x0, 0x8, 0x427f00000000, 0x4579a259bc98, 0x1
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259bb00:[0x420013f9b918]Panic_NoSave@vmkernel#nover+0x60 stack: 0x4579a259bb90, 0x4579a259bb90, 0x4579a259bb50, 0xffffffc8, 0x4579a259bb90
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259bb90:[0x420013fb0354]TLBGetLockedCPUBacktraces@vmkernel#nover+0x1c0 stack: 0x186a0, 0x4579a259bee0, 0x80b66, 0x41ffd3ee70dc, 0x41ffd3ef3000
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259be30:[0x420013fb06d0]TLBDoInvalidate@vmkernel#nover+0x278 stack: 0x436a90202cf0, 0x436a90202d00, 0x1, 0x41ffd3ef3000, 0x436a90202d00
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259bea0:[0x42001443c62c]UserMem_CartelFlush@vmkernel#nover+0xa4 stack: 0x436990c06fd8, 0x7291f57955, 0x436990c07008, 0x420014445548, 0x43523f222028
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259bf60:[0x4200144456f4]UserMemTouchedEstimationLoop@vmkernel#nover+0xf4 stack: 0x4579a259f100, 0x0, 0x0, 0x0, 0x0
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259bfe0:[0x4200144fb87c]CpuSched_StartWorld@vmkernel#nover+0x60 stack: 0x0, 0x4200142d6844, 0x0, 0x0, 0x0
    2025-07-01T21:49:25.884Z cpu8:2098251)0x4579a259c000:[0x4200142d6840]CpuSched_UseMwaitCallback@vmkernel#nover+0x130 stack: 0x0, 0x0, 0x0, 0x0, 0x0

    ----------------------------------------------------------------------------------------------------------------------------------

    2025-07-01T21:49:25.895Z cpu8:2098251)@BlueScreen: PCPU 0 locked up. Failed to ack TLB invalidate (at least 1 locked up, PCPU(s): 0).
    PCPU(s) did not respond to NMI. Possible hardware problem; contact hardware vendor.
    2025-07-01T21:49:25.895Z cpu8:2098251)Code start: 0x420013e00000 VMK uptime: 0:05:28:08.143
    2025-07-01T21:49:25.895Z cpu8:2098251)0x4579a259ba20:[0x420013f9b0e4]PanicvPanicInt@vmkernel#nover+0x114 stack: 0x0
    2025-07-01T21:49:25.896Z cpu8:2098251)0x4579a259bb00:[0x420013f9b918]Panic_NoSave@vmkernel#nover+0x60 stack: 0x4579a259bb90
    2025-07-01T21:49:25.896Z cpu8:2098251)0x4579a259bb90:[0x420013fb0354]TLBGetLockedCPUBacktraces@vmkernel#nover+0x1c0 stack: 0x186a0
    2025-07-01T21:49:25.896Z cpu8:2098251)0x4579a259be30:[0x420013fb06d0]TLBDoInvalidate@vmkernel#nover+0x278 stack: 0x436a90202cf0
    2025-07-01T21:49:25.896Z cpu8:2098251)0x4579a259bea0:[0x42001443c62c]UserMem_CartelFlush@vmkernel#nover+0xa4 stack: 0x436990c06fd8
    2025-07-01T21:49:25.897Z cpu8:2098251)0x4579a259bf60:[0x4200144456f4]UserMemTouchedEstimationLoop@vmkernel#nover+0xf4 stack: 0x4579a259f100
    2025-07-01T21:49:25.897Z cpu8:2098251)0x4579a259bfe0:[0x4200144fb87c]CpuSched_StartWorld@vmkernel#nover+0x60 stack: 0x0
    2025-07-01T21:49:25.897Z cpu8:2098251)0x4579a259c000:[0x4200142d6840]CpuSched_UseMwaitCallback@vmkernel#nover+0x130 stack: 0x0
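
To check whether a host shows both symptoms above, the log can be scanned for the ApeiPageRetire records and the lockup message. The following is a minimal Python sketch, not a supported tool: the regular expressions are written against the sample lines in this article only, and the function name summarize is hypothetical. Other builds may format these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the sample log lines in this article (an assumption;
# other ESX builds may log these records in a different format).
APEI_RE = re.compile(r"^\S+ cpu\d+:\d+\)ApeiPageRetire: \d+: (?P<msg>.*)$")
FAULT_RE = re.compile(r"Physical Fault Address: (0x[0-9a-fA-F]+)")
TYPE_RE = re.compile(r"Error Type: \d+ (\w+)")
LOCKUP_RE = re.compile(
    r"@BlueScreen: PCPU (\d+) locked up\. Failed to ack TLB invalidate"
)


def summarize(lines):
    """Count APEI fault addresses and error types, and list locked-up PCPUs."""
    addresses = Counter()
    error_types = Counter()
    locked_pcpus = []
    for raw in lines:
        line = raw.strip()
        apei = APEI_RE.match(line)
        if apei:
            msg = apei.group("msg")
            fault = FAULT_RE.search(msg)
            if fault:
                addresses[fault.group(1).lower()] += 1
            etype = TYPE_RE.search(msg)
            if etype:
                error_types[etype.group(1)] += 1
            continue
        lockup = LOCKUP_RE.search(line)
        if lockup:
            locked_pcpus.append(int(lockup.group(1)))
    return addresses, error_types, locked_pcpus
```

Passing the lines of vmkernel.log to summarize returns how often each physical fault address and error type was reported, plus the PCPU numbers named in any "Failed to ack TLB invalidate" panic. A fault address that repeats, as 0x8000400105349440 does in the excerpt above, is useful detail to hand to the hardware vendor.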


Environment

ESX on ARM

Cause

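The suspected cause is a fault in the underlying ARM hardware. Both the CACHE_ERROR records reported by ApeiPageRetire and the PCPU that failed to respond to the NMI point to a hardware problem.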

Resolution

Engage the hardware vendor to investigate the reported hardware errors further.
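
When opening a case with the vendor, it can help to state which CPU part and which core reported the error. The sketch below decodes the MIDR and MPIDR values from the ApeiPageRetire lines using the field layout of the Arm MIDR_EL1 and MPIDR_EL1 registers. It assumes the logged values are raw copies of those registers; the function names decode_midr and decode_mpidr are hypothetical.

```python
def decode_midr(midr):
    """Split a MIDR_EL1 value into its architectural fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # bits [31:24]
        "variant": (midr >> 20) & 0xF,        # bits [23:20]
        "architecture": (midr >> 16) & 0xF,   # bits [19:16]
        "part_number": (midr >> 4) & 0xFFF,   # bits [15:4]
        "revision": midr & 0xF,               # bits [3:0]
    }


def decode_mpidr(mpidr):
    """Split an MPIDR_EL1 value into its affinity fields and the MT bit."""
    return {
        "aff3": (mpidr >> 32) & 0xFF,  # bits [39:32]
        "aff2": (mpidr >> 16) & 0xFF,  # bits [23:16]
        "aff1": (mpidr >> 8) & 0xFF,   # bits [15:8]
        "aff0": mpidr & 0xFF,          # bits [7:0]
        "mt": (mpidr >> 24) & 0x1,     # bit 24
    }


# Values taken from the log excerpt in this article.
midr_fields = decode_midr(0x413FD0C1)
mpidr_fields = decode_mpidr(0x181120000)
```

For the values in this article, the MIDR decodes to implementer 0x41 (Arm), part number 0xd0c, variant 3, revision 1. How the affinity fields map to a physical socket or core is platform specific, so confirm the interpretation with the hardware vendor.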

Additional Information

ESX on ARM is a Fling project and is community supported. The community can be found at:

Flings

Code