PSOD on ESXi hosts when deleting virtual machines, with "Res3VerifyAddrTypesVMFS6" and "Fil3TruncateHelperVMFS6" in the backtrace

Article ID: 387559

Products

VMware vSphere ESXi

Issue/Introduction

An ESXi host fails with a PSOD when a virtual machine is deleted. The backtrace resembles the one below, and the underlying cause is on-disk corruption of the VMFS volume.

The backtrace contains the following symbols:

  • Res3VerifyAddrTypesVMFS6
  • Fil3TruncateHelperVMFS6
  • Fil3_SetFileLengthVMFS6
  • Fil3_SetFileLength
  • Fil3RemoveCoreVMFS6
  • FS3DESC_REGFILE (the file type passed to Fil3_RemoveTruncateFile)

Example backtrace:

#0  Panic_WithBacktrace (sbt=sbt@entry=0x430178402ea8, fmt=fmt@entry=0x420031df4630 "NMI IPI: Panic requested by another PCPU. RIPOFF(base):RBP:CS [%#lx(%#lx):%#lx:%#lx] (Src %#x, CPU%u)")
    at bora/vmkernel/main/panic.c:147
#1  0x00004200318fbe0d in NMIHandleBtOrHaltRequest (source=NMI_SRC_HEARTBEAT, fullFrame=0x452840082f40) at bora/vmkernel/main/nmi.c:699
#2  NMIHandleIPISource (fullFrame=0x452840082f40) at bora/vmkernel/main/nmi.c:534
#3  NMI_Interrupt (fullFrame=fullFrame@entry=0x452840082f40) at bora/vmkernel/main/nmi.c:753
#4  0x0000420031953393 in IDTNMIWork (fullFrame=0x452840082f40) at bora/vmkernel/main/x86/idt.c:1495
#5  0x00004200319546dd in Int2_NMI (fullFrame=0x452840082f40) at bora/vmkernel/main/x86/idt.c:760
#6  0x000042003194e068 in gate_entry ()
#7  0x00004200329f16a6 in Res3VerifyAddrTypesVMFS6 (numAddrs=511, addrVec=0x431022cd5008, resType=0x431022cd5160) at bora/modules/vmkernel/vmfs/fs3ResourceVMFS6.c:6406
#8  Res3FreeIntVMFS6 (numNull=0x45389ab1b7d8, numFreed=0x45389ab1b7d4, freeOnLastRef=1 '\001', allocUnit=1, numEntries=511, addrVec=0x431022cd5008, txn=0x7, resType=0x431022cd5160)
    at bora/modules/vmkernel/vmfs/fs3ResourceVMFS6.c:5469
#9  Res3Free64VMFS6 (resType=0x431022cd5160, txn=0x7, addrVec=0x431022cd5008, numEntries=511, allocUnit=1, freeOnLastRef=1 '\001', numFreed=0x45389ab1b7d4, numNull=0x45389ab1b7d8)
    at bora/modules/vmkernel/vmfs/fs3ResourceVMFS6.c:5433
#10 0x00004200329e490a in Res3FreeInvalid64VMFS6 (resType=<optimized out>, txn=<optimized out>, addrVec=<optimized out>, numEntries=<optimized out>, allocUnit=<optimized out>,
    freeOnLastRef=<optimized out>, numFreed=0x45389ab1b7d4, numNull=0x45389ab1b7d8) at bora/modules/vmkernel/vmfs/fs3ResourceVMFS6.c:15198
#11 0x00004200329bd393 in Fil3TruncateHelperVMFS6 (txn=txn@entry=0x0, obj=obj@entry=0x4307ce9e9170, newLength=newLength@entry=0, unmapBlocks=unmapBlocks@entry=0 '\000',
    syncReplay=syncReplay@entry=0 '\000') at bora/modules/vmkernel/vmfs/fs3fileVMFS6.c:12939
#12 0x00004200329be734 in Fil3TruncateVMFS6 (txn=txn@entry=0x0, obj=obj@entry=0x4307ce9e9170, length=length@entry=0, unmapBlocks=unmapBlocks@entry=0 '\000', syncReplay=0 '\000')
    at bora/modules/vmkernel/vmfs/fs3fileVMFS6.c:16467
#13 0x00004200329ca7bf in Fil3_SetFileLengthVMFS6 (obj=0x4307ce9e9170, length=0, flags=0) at bora/modules/vmkernel/vmfs/fs3fileVMFS6.c:16259
#14 0x0000420032954083 in Fil3_SetFileLength (flags=0, length=0, obj=0x4307ce9e9170) at bora/modules/vmkernel/vmfs/fs3file.c:384
#15 Fil3_RemoveTruncateFile (parent=parent@entry=0x4307ce81b4a0, child=child@entry=0x4307ce9e9170, childName=childName@entry=0x45389ab1bdc0 "<vm>.vmdk", truncateToOffset=0,
    type=type@entry=FS3DESC_REGFILE) at bora/modules/vmkernel/vmfs/fs3file.c:12009
#16 0x00004200329c477d in Fil3RemoveCoreVMFS6 (type=<optimized out>, childName=0x45389ab1bdc0 "<vm>.vmdk", child=0x4307ce9e9170, parent=0x4307ce81b4a0)
    at bora/modules/vmkernel/vmfs/fs3fileVMFS6.c:4269
#17 Fil3RemoveCoreVMFS6WithRetry (type=<optimized out>, childName=<optimized out>, child=<optimized out>, parent=0x4307ce81b4a0) at bora/modules/vmkernel/vmfs/fs3fileVMFS6.c:932
#18 Fil3RemoveVMFS6 (parent=0x4307ce81b4a0, childName=<optimized out>, type=<optimized out>) at bora/modules/vmkernel/vmfs/fs3fileVMFS6.c:15639
#19 0x0000420032941424 in Fil3_Unlink (parent=0x4307ce81b4a0, childName=0x45389ab1bdc0 "<vm>.vmdk") at bora/modules/vmkernel/vmfs/fs3file.c:2422
#20 0x000042003183ba70 in FSSVec_Unlink (desc=<optimized out>, name=<optimized out>) at bora/vmkernel/filesystems/fsSwitchVec.c:739
#21 0x0000420031837542 in FSS_Unlink (parentOID=0x4307ce81b508, fileName=fileName@entry=0x45389ab1bdc0 "<vm>.vmdk") at bora/vmkernel/filesystems/fsSwitch.c:4201
#22 0x0000420031d05c8e in UserFileUnlink (parent=<optimized out>, arc=0x45389ab1bdc0 "<vm>.vmdk") at bora/vmkernel/user/userFile.c:1299
#23 0x0000420031cbba7f in UserObj_Unlink (uci=uci@entry=0x431d0c202010, vmkacPerm=vmkacPerm@entry=2, path=<optimized out>) at bora/vmkernel/user/userObj.c:4505
#24 0x0000420031d0850c in LinuxFileDesc_Unlink (userPath=<optimized out>) at bora/vmkernel/user/linuxFileDesc.c:1243
#25 0x0000420031cb4d24 in User_LinuxSyscallHandler (fullFrame=0x45389ab1bf40) at bora/vmkernel/user/user.c:2057

Environment

ESXi 7.0

ESXi 8.0

Cause

  • A pointer block (PB) belonging to a file is corrupted and holds an invalid address for one of its blocks.

  • Because of this, the resource/block free code returns VMK_INVALID_ADDRESS when it checks the validity of the addresses in the PB.

  • Fil3TruncateHelperVMFS6 then converts VMK_INVALID_ADDRESS to VMK_OK.

  • As a result, the while loop that checks totalToFree never terminates. Because the btFreed and nNull values returned by the resource free call are both 0 in this case, totalToFree never makes progress, and the PCPU eventually locks up.
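
The sequence above can be illustrated with a small, self-contained simulation. This is only a sketch of the described failure pattern, not VMware source code: the function names free_blocks and truncate_loop, the INVALID_ADDR marker, the mask_error flag, and the max_iters guard are all hypothetical. Only the status names VMK_OK and VMK_INVALID_ADDRESS and the idea of a totalToFree counter come from this article. On a real host there is no iteration guard; the backtrace shows the heartbeat NMI panicking the locked-up PCPU instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; only the two names are taken from the article. */
typedef enum { VMK_OK = 0, VMK_INVALID_ADDRESS = 1 } vmk_status;

/* Hypothetical marker for a corrupted pointer-block entry. */
#define INVALID_ADDR UINT64_MAX

/* Stand-in for the resource free path: every address is validated first,
 * and nothing is freed if any entry is invalid. */
static vmk_status free_blocks(const uint64_t *addrs, size_t n,
                              size_t *freed, size_t *nulls)
{
    *freed = 0;
    *nulls = 0;
    for (size_t i = 0; i < n; i++) {
        if (addrs[i] == INVALID_ADDR) {
            return VMK_INVALID_ADDRESS;   /* counters stay at 0 */
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (addrs[i] == 0) {
            (*nulls)++;                   /* empty slot */
        } else {
            (*freed)++;                   /* block released */
        }
    }
    return VMK_OK;
}

/* Stand-in for the truncate loop. Returns the number of iterations executed.
 * A return value equal to max_iters means the loop was still spinning when
 * the guard stopped it. mask_error = true mimics the described behavior of
 * turning VMK_INVALID_ADDRESS into VMK_OK. */
static size_t truncate_loop(const uint64_t *addrs, size_t n,
                            bool mask_error, size_t max_iters)
{
    size_t total_to_free = n;
    size_t iters = 0;

    while (total_to_free > 0 && iters < max_iters) {
        size_t freed, nulls;
        vmk_status st = free_blocks(addrs, n, &freed, &nulls);
        iters++;

        if (st == VMK_INVALID_ADDRESS && mask_error) {
            st = VMK_OK;                  /* error is swallowed here */
        }
        if (st != VMK_OK) {
            break;                        /* error propagated: loop exits */
        }
        total_to_free -= freed + nulls;   /* subtracts 0 when validation failed */
    }
    return iters;
}
```

With a healthy address vector such as {10, 0, 30}, truncate_loop finishes in one iteration. With a corrupted vector such as {10, INVALID_ADDR, 30} and mask_error set to true, every pass subtracts 0 from total_to_free, so the loop only stops at max_iters. Passing mask_error as false shows that propagating the error would let the loop exit; whether the shipped fix works this way is not stated in this article.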

Resolution

This issue is fixed in the following releases:

  • vSphere 7.0 P10 or ESXi 7.0 Update 3n, Build 21930508

  • vSphere 8.0.3 P04 or ESXi 8.0 Update 3b, Build 24280767