In certain scenarios, when a CMMDS (Cluster Membership, Monitoring, and Directory Services) metadata update from the vSAN witness appliance to the CMMDS leader node encounters an error, the vSAN node crashes with a PSOD (purple screen of death). This is by design to preserve data integrity.
Example PSOD Backtrace:
Panic Details: Crash at 2024-12-28T14:08:15.753Z on CPU 38 running world 2098455 - VSAN_0x431f15bb0040_CMMDSProces. VMK Uptime:95:04:48:47.696
Panic Message: @BlueScreen: Failed at bora/modules/vmkernel/cmmds/cmmds_audit.c:478 -- NOT REACHED
Backtrace:
0x4538e221b890:[0x42000d97bb40]PanicvPanicInt@vmkernel#nover+0x20c stack: 0x42000dea2874, 0x42000d97bb40, 0x0, 0x420000000001, 0x42000d97bb40
0x4538e221b940:[0x42000d97c33c]Panic_vPanic@vmkernel#nover+0x25 stack: 0x583006400000200, 0x42000d997c41, 0x40c7463d6981f452, 0xbef01e5100000010, 0x4538e221b9c0
0x4538e221b960:[0x42000d997c40]vmk_PanicWithModuleID@vmkernel#nover+0x41 stack: 0x4538e221b9c0, 0x4538e221b980, 0x0, 0xd98c1dd400000201, 0x42000fb970f8
0x4538e221b9c0:[0x42000fb0d9ba][email protected]#0.0.0.1+0x8b stack: 0x0, 0x0, 0x0, 0x0, 0xffffffffffffffff
0x4538e221ba60:[0x42000fb0dbb0][email protected]#0.0.0.1+0x11 stack: 0x431f15bd5390, 0x42000fb316b7, 0x45390519f000, 0x4538e221bce8, 0x40c7463d6981f452
0x4538e221ba80:[0x42000fb316b6][email protected]#0.0.0.1+0x23f stack: 0x40c7463d6981f452, 0xbef01e511ff7edb4, 0x4a93c8c9fc110664, 0x942e828ced5cad71, 0x0
0x4538e221bc30:[0x42000fb3c655][email protected]#0.0.0.1+0x20a stack: 0x213259a56, 0x0, 0x200000000, 0xd0b2725200000040, 0x52b31a5491a85956
0x4538e221bdb0:[0x42000fb56a6b][email protected]#0.0.0.1+0x38 stack: 0x431f15bd5390, 0x18, 0x4316b3e014d0, 0x42000fb5705c, 0x42000f52e3f4
0x4538e221bde0:[0x42000fb5705b][email protected]#0.0.0.1+0xb0 stack: 0x3a74d, 0x0, 0xffffffffffffffff, 0xffff, 0x0
0x4538e221be80:[0x42000fb4af2e][email protected]#0.0.0.1+0x83 stack: 0x431f15bb01c0, 0x431f15bb0040, 0x45f80f5887b98c, 0x4538e221bf50, 0x0
0x4538e221bed0:[0x42000f531a46][email protected]#0.0.0.1+0x23f stack: 0x431f15bb0100, 0x0, 0x45f80f58885af8, 0x431f15bb01d8, 0x45f80f5887ceac
0x4538e221bfa0:[0x42000d99f778]vmkWorldFunc@vmkernel#nover+0x31 stack: 0x42000d99f774, 0x0, 0x4538e221f000, 0x4538c9b1f100, 0x4538e221f100
0x4538e221bfe0:[0x42000ded67b2]CpuSched_StartWorld@vmkernel#nover+0xbf stack: 0x0, 0x42000d944c70, 0x0, 0x0, 0x0
0x4538e221c000:[0x42000d944c6f]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0, 0x0, 0x0, 0x0, 0x0
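To check whether a PSOD on another host matches this signature, the panic message can be compared against the example above. The following is a minimal sketch, not a supported tool: the function name `matches_cmmds_audit_psod` is hypothetical, and it assumes the panic text has already been copied out of the PSOD screenshot or log bundle as plain text. It deliberately ignores the source line number (478 in the example), on the assumption that it may differ between ESXi builds.

```python
import re

# Matches a "Panic Message" of the form shown in the example above:
#   @BlueScreen: Failed at <source path>:<line> -- <reason>
PANIC_RE = re.compile(
    r"@BlueScreen: Failed at (?P<path>\S+):(?P<line>\d+) -- (?P<reason>.+)"
)


def matches_cmmds_audit_psod(panic_text: str) -> bool:
    """Return True if the text contains a NOT REACHED panic raised from cmmds_audit.c.

    Hypothetical helper for illustration only. The line number is not
    compared, since it is assumed to vary between builds.
    """
    for match in PANIC_RE.finditer(panic_text):
        from_audit = match.group("path").endswith("cmmds/cmmds_audit.c")
        not_reached = match.group("reason").strip() == "NOT REACHED"
        if from_audit and not_reached:
            return True
    return False


if __name__ == "__main__":
    sample = (
        "Panic Message: @BlueScreen: Failed at "
        "bora/modules/vmkernel/cmmds/cmmds_audit.c:478 -- NOT REACHED"
    )
    print(matches_cmmds_audit_psod(sample))
```

A match only indicates that the panic message resembles the one in this article. The full backtrace and log bundle are still needed to confirm the cause.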
vSAN 7.x and 8.x
This PSOD is by design to preserve data integrity in the case of a malformed metadata update from the witness appliance.
Reboot the ESXi host.
Open a case with Broadcom and upload a log bundle from the crashed ESXi host for analysis.
This is a rare scenario, and the PSOD is by design to preserve production data integrity.
This behavior applies only to metadata updates. If a similar inconsistency occurs in a data update, the vSAN code handles it without crashing the ESXi host.