PSOD WITH VSAN ESA AND OBJECTS UNABLE TO RESYNC

Article ID: 317859

Updated On:

Products

VMware vSAN

Issue/Introduction

This article describes a PSOD on vSAN ESA hosts that can leave objects unable to resync, along with the workaround and the resolution.

Impact/Risks:

VMs on the host that experiences the PSOD may crash and be unable to restart. A restore from backup may be required.

Symptoms:

This problem is specific to vSAN ESA in 8.0.

Following the PSOD, one or more VMs may crash, and resync of the affected objects may stall and make no progress. The associated VMs may be unable to recover and must then be restored from backup.

An example PSOD is shown below. Details such as time, UUIDs, worlds, or system names will vary.

Panic Details: Crash at 2023-04-03T19:40:53.279Z on CPU 88 running world 2099105 - VSAN_0x4329bf0b3d40_Owner. VMK Uptime:6:19:58:40.098

Panic Message: @BlueScreen: ########-####-####-####-########046c: Failed to wait for object exit.

Backtrace:

  0x453a2f59b920:[0x420020710c59]PanicvPanicInt@vmkernel#nover+0x1f9 stack: 0x420020766998, 0x420020710c59, 0x0, 0x420000000001, 0x420020710c59

  0x453a2f59b9d0:[0x420020711544]Panic_vPanic@vmkernel#nover+0x25 stack: 0x0, 0x420020728e29, 0x3, 0x420000000010, 0x453a2f59ba50

  0x453a2f59b9f0:[0x420020728e28]vmk_PanicWithModuleID@vmkernel#nover+0x41 stack: 0x453a2f59ba50, 0x453a2f59ba10, 0x0, 0x0, 0xc4

  0x453a2f59ba50:[0x42002269cd24][email protected]#0.0.0.1+0x785 stack: 0xfa, 0x7f, 0x42, 0x90, 0xe8

  0x453a2f59beb0:[0x4200226838f7][email protected]#0.0.0.1+0x10 stack: 0x45dad7700f00, 0x420021fec25c, 0x0, 0x4329bf0b3e00, 0x0

  0x453a2f59bed0:[0x420021fec25b][email protected]#0.0.0.1+0x330 stack: 0x0, 0x4329bf0b3ed8, 0x570a653a7f4d6, 0x8, 0x1

  0x453a2f59bf90:[0x420020730baf]vmkWorldFunc@vmkernel#nover+0x40 stack: 0x420020730bab, 0x0, 0x453a12f1f100, 0x453a2f59f000, 0x453a12f1f100

  0x453a2f59bfe0:[0x420020a14f9e]CpuSched_StartWorld@vmkernel#nover+0x7b stack: 0x0, 0x4200206d40d0, 0x0, 0x0, 0x0

  0x453a2f59c000:[0x4200206d40cf]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0, 0x0, 0x0, 0x0, 0x0

Environment

VMware vSAN 8.0.x

Cause

This problem is caused by a deadlock condition involving zDOM snapshots (unrelated to customer VM snapshots).

Resolution

This problem is resolved in 8.0 Update 1 P02 (8.0 U1c) and later. Please update to 8.0 U1c (build 22088125) or later as soon as possible.
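To confirm whether a host already carries the fix, its build number can be compared against 22088125. The sketch below assumes the version string has the form printed by `vmware -v` (for example `VMware ESXi 8.0.1 build-22088125`) and that the host is on the 8.0 line, since a build-number comparison is only meaningful within that line. The function name `has_fix` is illustrative only.

```shell
#!/bin/sh
# Build number of 8.0 U1c, taken from the Resolution section.
FIXED_BUILD=22088125

# has_fix VERSION_STRING
# Assumes a string shaped like the output of `vmware -v`,
# e.g. "VMware ESXi 8.0.1 build-22088125".
# Returns 0 if the build is at or above FIXED_BUILD, 1 if it is below,
# and 2 if no numeric build could be extracted.
has_fix() {
    build=${1##*build-}            # drop everything up to and including "build-"
    case $build in
        ''|*[!0-9]*) return 2 ;;   # empty or non-numeric: cannot decide
    esac
    [ "$build" -ge "$FIXED_BUILD" ]
}

# Intended use on an ESXi 8.0 host (assumption: run in the host shell):
#   has_fix "$(vmware -v)" && echo "build includes the fix"
```

A return code of 2 is kept separate from 1 so that an unparseable string is not mistaken for an unfixed host.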


Workaround:

The problem may be mitigated by disabling zDOM snapshots with the following command, run on all hosts in the cluster. The setting persists, so it should be reverted to 1 after upgrading to a release that contains the fix (8.0 U1c or later, per the Resolution above):

esxcfg-advcfg -s 0 /VSAN/zDOMSnapshotMode
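Because the setting has to be changed on every host in the cluster, it can help to generate the per-host commands first and review them before running anything. The following is a minimal dry-run sketch: it only prints commands and executes nothing. The host names, the use of SSH as root, and the function names are assumptions for illustration. `esxcfg-advcfg -g` is used to read the value back after it is set.

```shell
#!/bin/sh
# Advanced option named in the workaround above.
OPTION=/VSAN/zDOMSnapshotMode

# Print the command that sets the option to the given value
# (0 = workaround applied, 1 = value to restore after the upgrade).
zdom_set_cmd() {
    printf 'esxcfg-advcfg -s %s %s\n' "$1" "$OPTION"
}

# Print the command that reads the current value back.
zdom_get_cmd() {
    printf 'esxcfg-advcfg -g %s\n' "$OPTION"
}

# zdom_plan VALUE HOST...
# Print one ssh line per host. Dry run only: nothing is executed.
# Assumes SSH is enabled on the hosts and root login is permitted.
zdom_plan() {
    value=$1
    shift
    for host in "$@"; do
        printf 'ssh root@%s "%s && %s"\n' \
            "$host" "$(zdom_set_cmd "$value")" "$(zdom_get_cmd)"
    done
}

# Hypothetical host names; replace with the hosts of the affected cluster.
zdom_plan 0 esx01.example.com esx02.example.com
```

Calling `zdom_plan 1` with the same host list prints the commands that restore the original value once every host runs a fixed build.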


Additional Information