Symptoms:
WARNING: CACHE_SLOW ent: #Rder:1, #Pin:0, rwOwner:2165751, roOwner:2165744, self:0x210bf7, entry:0x476bcc187660
WARNING: ZDOMBLKCACHE: CACHE_SLOW: X Block {9fxxxxx-xxxx-xxxx-xxxx-xx44, 166, 1459309}, type: MiddleTree blocked 27280.4 sec.
WARNING: CACHE_SLOW ent: ref:0, dirty:1, skipRd:0, hasWt:0, inIO:0, inSet:1, del:0, WriterWait:1, flush:1
WARNING: CACHE_SLOW ent: #Rder:1, #Pin:0, rwOwner:2165751, roOwner:2165744, self:0x210bf7, entry:0x476bcc187660
WARNING: ZDOMBLKCACHE: CACHE_SLOW: X Block {9fxxxxx-xxxx-xxxx-xxxx-xx44, 166, 1459309}, type: MiddleTree blocked 27283.4 sec.
WARNING: CACHE_SLOW ent: ref:0, dirty:1, skipRd:0, hasWt:0, inIO:0, inSet:1, del:0, WriterWait:1, flush:1
WARNING: CACHE_SLOW ent: #Rder:1, #Pin:0, rwOwner:2165751, roOwner:2165744, self:0x210bf7, entry:0x476bcc187660
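When triaging a log bundle, it can help to pull the blocked duration out of each CACHE_SLOW warning and keep the worst value per block. The following is a minimal sketch, assuming the warnings keep the layout shown above; the helper names and the regular expression are illustrative and are not part of any VMware tooling.

```python
import re

# Pattern derived from the sample warnings above (an assumption: real logs
# may carry extra prefixes such as timestamps, so search() is used).
BLOCKED_RE = re.compile(
    r"CACHE_SLOW: \w+ Block \{(?P<block>[^}]*)\}, "
    r"type: (?P<type>\w+) blocked (?P<secs>\d+(?:\.\d+)?) sec"
)

SAMPLE = [
    "WARNING: ZDOMBLKCACHE: CACHE_SLOW: X Block {9fxxxxx-xxxx-xxxx-xxxx-xx44, 166, 1459309}, type: MiddleTree blocked 27280.4 sec.",
    "WARNING: CACHE_SLOW ent: ref:0, dirty:1, skipRd:0, hasWt:0, inIO:0, inSet:1, del:0, WriterWait:1, flush:1",
    "WARNING: ZDOMBLKCACHE: CACHE_SLOW: X Block {9fxxxxx-xxxx-xxxx-xxxx-xx44, 166, 1459309}, type: MiddleTree blocked 27283.4 sec.",
]


def parse_cache_slow(lines):
    """Return (block_id, block_type, seconds) for every 'blocked N sec' warning."""
    events = []
    for line in lines:
        match = BLOCKED_RE.search(line)
        if match:
            events.append(
                (match.group("block"), match.group("type"), float(match.group("secs")))
            )
    return events


def longest_blocked(lines):
    """Map each block id to the longest blocked duration seen for it."""
    worst = {}
    for block, _block_type, secs in parse_cache_slow(lines):
        worst[block] = max(secs, worst.get(block, 0.0))
    return worst


if __name__ == "__main__":
    for block, secs in longest_blocked(SAMPLE).items():
        print(f"{block}: blocked {secs:.1f} sec ({secs / 3600:.2f} h)")
```

A duration that keeps growing for the same block across successive warnings, as in the sample, indicates that the block has stayed stuck rather than briefly delayed.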
Note: If a PSOD occurs, a backtrace similar to the following is reported in the logs; the exact logging may vary.
2024-06-10T10:51:32.574Z cpu47:2099455)DOM: DOMOwnerUnsubscribeClusterEncrState:5934: DOM Owner on 21950e66-6c6c-8693-3e0e-bc97e1055ba0 unsubscribed cluster encryption state
2024-06-10T10:51:32.722Z cpu20:2099455)DOM: DOMOwnerUnsubscribeClusterEncrState:5925: DOM Owner on 21950e66-6c6c-8693-3e0e-bc97e1055ba0 received premature cluster encryption state unsubscription
cpu28:2099455)@BlueScreen: 05915d66-80b3-5c28-b78e-bc97e1055ba0: Failed to wait for object exit.
cpu28:2099455)Code start: 0x420000a00000 VMK uptime: 0:09:29:58.854
cpu28:2099455)0x453ab6d9b920:[0x420000b19b5a]PanicvPanicInt@vmkernel#nover+0x202 stack: 0x420042801100
cpu28:2099455)0x453ab6d9b9d0:[0x420000b1a47c]Panic_vPanic@vmkernel#nover+0x25 stack: 0x0
cpu28:2099455)0x453ab6d9b9f0:[0x420000b32560]vmk_PanicWithModuleID@vmkernel#nover+0x41 stack: 0x453ab6d9ba50
cpu28:2099455)0x453ab6d9ba50:[0x4200030dd63d][email protected]#0.0.0.1+0xf36 stack: 0x5c
cpu28:2099455)0x453ab6d9bec0:[0x420003190c61][email protected]#0.0.0.1+0x12 stack: 0x3e0fb637fc74
cpu28:2099455)0x453ab6d9bee0:[0x420002a1ae97][email protected]#0.0.0.1+0x230 stack: 0x3e0fb63873be
cpu28:2099455)0x453ab6d9bfa0:[0x420000b3a234]vmkWorldFunc@vmkernel#nover+0x31 stack: 0x420000b3a230
cpu28:2099455)0x453ab6d9bfe0:[0x420000e2c015]CpuSched_StartWorld@vmkernel#nover+0xe2 stack: 0x0
cpu28:2099455)0x453ab6d9c000:[0x420000adbdff]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
cpu28:2099455)base fs=0x0 gs=0x420047000000 Kgs=0x0
The PSOD and the TX hang share the same trigger, but either may occur without the other.
The Broadcom bnxtnet async driver, version 224.0.x.x or later, has an issue that can cause it to miss TX packet completions under certain circumstances. This can block the VM's vNIC TX queues and thus block some or all packets leaving the vNIC.
Broadcom has released new versions of bnxtnet and bnxtroce drivers containing the fix, starting with version 226.0.145.4-1.
Please consult the VCG (HCL) or your OEM for the driver and firmware version matching the specific NIC model.
Downgrading the driver version is not an option, as a separate issue has been identified in versions below 224.0.x.x.
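The version statements above can be turned into a quick check against an installed driver version string. This is a rough sketch under stated assumptions: that every version from 224.0.0.0 up to, but not including, 226.0.145.4 is affected, and that the version string begins with four dotted integers (any build suffix after them is ignored). The function names and labels are hypothetical.

```python
import re

# Boundaries taken from the article; treating everything in between as
# affected is an assumption, not a vendor statement.
AFFECTED_FROM = (224, 0, 0, 0)
FIXED_FROM = (226, 0, 145, 4)


def parse_version(text):
    """Extract the leading a.b.c.d numeric version as a tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized driver version: {text!r}")
    return tuple(int(part) for part in match.groups())


def classify_bnxtnet(text):
    """Classify a bnxtnet version string against the article's ranges."""
    version = parse_version(text)
    if version >= FIXED_FROM:
        return "fixed"
    if version >= AFFECTED_FROM:
        return "affected"
    # Older than 224.0.x.x: the article reports a different issue there.
    return "below-224"


if __name__ == "__main__":
    for sample in ("224.0.160.0", "226.0.145.4-1", "216.0.50.0"):
        print(sample, "->", classify_bnxtnet(sample))
```

Tuples of integers are compared rather than raw strings so that, for example, 226.0.145.4 sorts after 226.0.99.0. The result is only a first filter; the VCG (HCL) or the OEM remains the authority for the driver and firmware pairing of a specific NIC model.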