cpu0:2352732)@BlueScreen: NMI IPI: Panic requested by another PCPU. RIPOFF(base):RBP:CS [0x1404f8(0x420004800000):0x12b8:0xf48] (Src 0x1, CPU0)
cpu0:2352732)Code start: 0x420004800000 VMK uptime: 11:07:27:23.196
cpu0:2352732)Saved backtrace from: pcpu 0 Heartbeat NMI
cpu0:2352732)0x45394629b8b8:[0x4200049404f7]HeapVSIAddChunkInfo@vmkernel#nover+0x1b0 stack: 0x420005bd611e
cpu0:2352732)0x45394629b8c0:[0x420004943036]Heap_AlignWithTimeoutAndRA@vmkernel#nover+0x1eb stack: 0x431822b49000
cpu0:2352732)0x45394629b940:[0x420005bd611d]J6_NewOnDiskTxn@esx#nover+0x15a stack: 0x43181f200560
cpu0:2352732)0x45394629b9a0:[0x420005bd667d]J6CommitInMemTxn@esx#nover+0x176 stack: 0x1
cpu0:2352732)0x45394629ba50:[0x420005bd318a]J6_CommitMemTransaction@esx#nover+0xe3 stack: 0x1e9400000037
cpu0:2352732)0x45394629baa0:[0x420005bf8ad4]Fil6_UnmapTxn@esx#nover+0x4fd stack: 0x0
cpu0:2352732)0x45394629bbb0:[0x420005bfc891]Fil6UpdateBlocks@esx#nover+0x4e2 stack: 0xff
cpu0:2352732)0x45394629bc30:[0x420005bbc3fe]Fil3UpdateBlocks@esx#nover+0xeb stack: 0x21f9b800
cpu0:2352732)0x45394629bd30:[0x420005bbd425]Fil3_PunchFileHoleWithRetry@esx#nover+0x7e stack: 0x45394629bec8
cpu0:2352732)0x45394629bde0:[0x420005bbdc0d]Fil3_FileBlockUnmap@esx#nover+0x57e stack: 0x43181eeddfd0
cpu0:2352732)0x45394629be90:[0x42000483b5fb]FSSVec_FileBlockUnmap@vmkernel#nover+0x20 stack: 0x45b9414506e0
cpu0:2352732)0x45394629bea0:[0x420004d52c03]VSCSI_ExecFSSUnmap@vmkernel#nover+0x9c stack: 0x430cbe01c170
cpu0:2352732)0x45394629bf10:[0x420004d50ead]VSCSIDoEmulHelperIO@vmkernel#nover+0x2a stack: 0x430cbe001818
cpu0:2352732)0x45394629bf40:[0x4200048d9c19]HelperQueueFunc@vmkernel#nover+0x1d2 stack: 0x4539462a0b48
cpu0:2352732)0x45394629bfe0:[0x420004bb1775]CpuSched_StartWorld@vmkernel#nover+0x86 stack: 0x0
cpu0:2352732)0x45394629c000:[0x4200048c46ff]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
cpu0:2352732)base fs=0x0 gs=0x420040000000 Kgs=0x0
cpu0:2352732)1 other PCPU is in panic.
NMI: 689: NMI IPI: RIPOFF(base):RBP:CS [0x144c3b(0x420004800000):0x12c0:0xf48] (Src 0x1, CPU0)
cpu0:2352732)NMI: 689: NMI IPI: RIPOFF(base):RBP:CS [0x104eff(0x420004800000):0x0:0xf48] (Src 0x1, CPU0
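
When triaging a PSOD, it can help to check mechanically whether the captured backtrace contains the UNMAP journal path shown above. The following is a minimal sketch, not a vendor tool. The choice of VSCSI_ExecFSSUnmap, Fil6_UnmapTxn and J6_NewOnDiskTxn as the identifying frames is an assumption made for illustration, and the helper names are hypothetical.

```python
import re

# Function names taken from the backtrace above. Treating these three as the
# identifying signature is an assumption of this sketch, not a vendor rule.
SIGNATURE_FRAMES = ("VSCSI_ExecFSSUnmap", "Fil6_UnmapTxn", "J6_NewOnDiskTxn")

# Matches frames such as "[0x420005bd611d]J6_NewOnDiskTxn@esx#nover+0x15a".
FRAME_RE = re.compile(r"\[0x[0-9a-fA-F]+\](\w+)@")


def backtrace_functions(text):
    """Return the function names found in a PSOD backtrace, in order."""
    return FRAME_RE.findall(text)


def matches_unmap_signature(text):
    """Return True if every signature frame appears in the backtrace."""
    found = set(backtrace_functions(text))
    return all(name in found for name in SIGNATURE_FRAMES)
```

A match only indicates that the panic occurred on the same code path. It does not by itself confirm the root cause described in this article.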
VMware vSphere ESXi 7.x
In the ESXi 7.0.3 release, VMFS was changed to report uniform UNMAP granularities across VMFS and SE Sparse snapshots. As part of this change, the maximum UNMAP granularity reported by VMFS was adjusted to 2GB. In rare situations, a 2GB TRIM/UNMAP request issued from the guest OS can result in a VMFS metadata transaction that requires lock acquisition on a large number of resource clusters (greater than 50). This case is not handled correctly and results in an ESXi PSOD. VMFS metadata transactions requiring lock actions on more than 50 resource clusters are uncommon and can happen on aged datastores. This issue only impacts Thin Provisioned VMDKs; Thick and Eager Zeroed Thick VMDKs are not impacted.
There are a few options available to work around this issue. Any one of these workarounds will prevent the issue from happening; customers only need to choose the one that best fits their situation.
1. Revert to the previous version of ESXi that is not impacted by this concern.
REF: https://knowledge.broadcom.com/external/article/316592/reverting-to-a-previous-version-of-esxi.html
2. Convert thin VMDKs to Thick or Eager Zeroed Thick provisioning
REF: Determine the Virtual Disk Format and Convert a Virtual Disk from the Thin Provision Format to a Thick Provision Format
REF: Inflate Thin Virtual Disks
3. Disable TRIM/UNMAP in the Guest OS
Note: Consult the OS documentation on how to adjust TRIM/UNMAP features for a complete understanding of OS-specific configuration needs. Functions and capabilities may vary across distributions and versions.
Examples: https://www.suse.com/support/kb/doc/?id=000019447
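
Before changing a guest, it can be useful to see which file systems currently issue online discards. The sketch below assumes a Linux guest and parses text in the /proc/mounts format, listing mount points whose options include discard. The function name is hypothetical. It does not detect periodic trimming (for example, a scheduled fstrim job), which must be checked separately according to the OS documentation.

```python
def mounts_with_discard(mounts_text):
    """Return (mount_point, fs_type) pairs whose options include discard.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        for opt in options.split(","):
            # Match "discard" and variants such as "discard=async".
            if opt == "discard" or opt.startswith("discard="):
                hits.append((mount_point, fs_type))
                break
    return hits
```

On a Linux guest, pass it the contents of /proc/mounts, for example mounts_with_discard(open("/proc/mounts").read()). An empty result means no file system is mounted with online discard; it does not mean TRIM/UNMAP is fully disabled.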