PSOD on vSAN host when CMMDS metadata update fails

Article ID: 387631

Products

VMware vSAN

Issue/Introduction

In certain scenarios, when a CMMDS (Cluster Membership, Monitoring, and Directory Services) metadata update sent from the vSAN witness appliance to the CMMDS leader node encounters an error, the vSAN node crashes with a PSOD (purple screen of death). This behavior is by design to preserve data integrity.

Example PSOD Backtrace:

Panic Details: Crash at 2024-12-28T14:08:15.753Z on CPU 38 running world 2098455 - VSAN_0x431f15bb0040_CMMDSProces. VMK Uptime:95:04:48:47.696
Panic Message: @BlueScreen: Failed at bora/modules/vmkernel/cmmds/cmmds_audit.c:478 -- NOT REACHED
Backtrace:
  0x4538e221b890:[0x42000d97bb40]PanicvPanicInt@vmkernel#nover+0x20c stack: 0x42000dea2874, 0x42000d97bb40, 0x0, 0x420000000001, 0x42000d97bb40
  0x4538e221b940:[0x42000d97c33c]Panic_vPanic@vmkernel#nover+0x25 stack: 0x583006400000200, 0x42000d997c41, 0x40c7463d6981f452, 0xbef01e5100000010, 0x4538e221b9c0
  0x4538e221b960:[0x42000d997c40]vmk_PanicWithModuleID@vmkernel#nover+0x41 stack: 0x4538e221b9c0, 0x4538e221b980, 0x0, 0xd98c1dd400000201, 0x42000fb970f8
  0x4538e221b9c0:[0x42000fb0d9ba][email protected]#0.0.0.1+0x8b stack: 0x0, 0x0, 0x0, 0x0, 0xffffffffffffffff
  0x4538e221ba60:[0x42000fb0dbb0][email protected]#0.0.0.1+0x11 stack: 0x431f15bd5390, 0x42000fb316b7, 0x45390519f000, 0x4538e221bce8, 0x40c7463d6981f452
  0x4538e221ba80:[0x42000fb316b6][email protected]#0.0.0.1+0x23f stack: 0x40c7463d6981f452, 0xbef01e511ff7edb4, 0x4a93c8c9fc110664, 0x942e828ced5cad71, 0x0
  0x4538e221bc30:[0x42000fb3c655][email protected]#0.0.0.1+0x20a stack: 0x213259a56, 0x0, 0x200000000, 0xd0b2725200000040, 0x52b31a5491a85956
  0x4538e221bdb0:[0x42000fb56a6b][email protected]#0.0.0.1+0x38 stack: 0x431f15bd5390, 0x18, 0x4316b3e014d0, 0x42000fb5705c, 0x42000f52e3f4
  0x4538e221bde0:[0x42000fb5705b][email protected]#0.0.0.1+0xb0 stack: 0x3a74d, 0x0, 0xffffffffffffffff, 0xffff, 0x0
  0x4538e221be80:[0x42000fb4af2e][email protected]#0.0.0.1+0x83 stack: 0x431f15bb01c0, 0x431f15bb0040, 0x45f80f5887b98c, 0x4538e221bf50, 0x0
  0x4538e221bed0:[0x42000f531a46][email protected]#0.0.0.1+0x23f stack: 0x431f15bb0100, 0x0, 0x45f80f58885af8, 0x431f15bb01d8, 0x45f80f5887ceac
  0x4538e221bfa0:[0x42000d99f778]vmkWorldFunc@vmkernel#nover+0x31 stack: 0x42000d99f774, 0x0, 0x4538e221f000, 0x4538c9b1f100, 0x4538e221f100
  0x4538e221bfe0:[0x42000ded67b2]CpuSched_StartWorld@vmkernel#nover+0xbf stack: 0x0, 0x42000d944c70, 0x0, 0x0, 0x0
  0x4538e221c000:[0x42000d944c6f]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0, 0x0, 0x0, 0x0, 0x0
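
When triaging a PSOD screenshot or extracted panic text, it can help to check quickly whether the panic resembles the example above. The following is a minimal sketch, not a Broadcom tool: the function name `matches_cmmds_audit_psod` and both regular expressions are illustrative assumptions derived only from the example panic shown in this article. The source line number (478) is deliberately not matched, on the assumption that it may differ between builds. A match is only a hint and does not replace log analysis by Broadcom Support.

```python
import re

# Both patterns are assumptions based solely on the example panic in this article.
PANIC_RE = re.compile(r"@BlueScreen: Failed at (?P<path>\S+):(?P<line>\d+) -- (?P<reason>.+)")
WORLD_RE = re.compile(r"running world (?P<id>\d+) - (?P<name>\S+)\. VMK Uptime")

# The first two lines of the example panic, copied from this article.
SAMPLE = (
    "Panic Details: Crash at 2024-12-28T14:08:15.753Z on CPU 38 running world 2098455 - "
    "VSAN_0x431f15bb0040_CMMDSProces. VMK Uptime:95:04:48:47.696\n"
    "Panic Message: @BlueScreen: Failed at bora/modules/vmkernel/cmmds/cmmds_audit.c:478 -- NOT REACHED\n"
)


def matches_cmmds_audit_psod(panic_text: str) -> bool:
    """Return True if the panic text resembles the signature shown in this article."""
    panic = PANIC_RE.search(panic_text)
    world = WORLD_RE.search(panic_text)
    if panic is None or world is None:
        return False
    # The panic must originate in the CMMDS audit source file ...
    in_audit_file = panic.group("path").endswith("cmmds/cmmds_audit.c")
    # ... and the crashing world must be a CMMDS world.
    in_cmmds_world = "CMMDS" in world.group("name")
    return in_audit_file and in_cmmds_world


if __name__ == "__main__":
    print(matches_cmmds_audit_psod(SAMPLE))
```

Matching on the file path and world name rather than on backtrace addresses keeps the check independent of load addresses, which vary from host to host.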

Environment

vSAN 7.x and 8.x

Cause

The PSOD is triggered by design to preserve data integrity when a metadata update from the witness appliance is malformed.

Resolution

Reboot the ESXi host.

Open a case with Broadcom and upload a log bundle from the crashed ESXi host for analysis.

Additional Information

This is a rare scenario, and the PSOD is by design to preserve production data integrity.

This behavior applies only to metadata updates. If a similar inconsistency occurs during a data update, the vSAN code handles it without crashing the ESXi host.