This KB advises that the issue described below may occur and directs you to contact VMware for assistance in resolving it.
Symptoms:
A vSAN disk group is taken offline, and the vmkernel log contains messages similar to the examples below (the specific dates, times, and IDs will differ in your environment):
Example 1:
2020-05-21T15:22:38.514Z cpu1:1000341425)WARNING: PLOG: DDPCacheIOCb:686: Trying to format a valid metadata block, UUID 52fcff55-6866-a3d0-d0d5-ba4e3c1d9362, type 4, pbn 4398046515647
2020-05-21T15:22:38.514Z cpu0:1000214054)WARNING: PLOG: DDPCompleteDDPWrite:6455: Throttled: DDP write failed Invalid metadata callback
[email protected]#0.0.0.1, diskgroup 5287714a-e5a0-d986-1f12-e0c960878e53 txnScopeIdx 0
2020-05-21T15:22:38.514Z cpu0:1000214054)PLOG: DDPCompleteDDPWrite:6469: Throttled: (DDPWrite): Curr: completeTask, Prev: updateHashmap, Status: Success
2020-05-21T15:22:38.514Z cpu0:1000214054)WARNING: PLOG: PLOGDDPWriteCbFn:655: DDP write failed on device 52fcff55-6866-a3d0-d0d5-ba4e3c1d9362:Invalid metadata (ssdPerm: no)elevIo 0, doDdpCommit yes
2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateError:4232: DDP: Propagating error state from original device 52fcff55-6866-a3d0-d0d5-ba4e3c1d9362
2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateError:4284: DDP: Propagating error state to MDs in device 5287714a-e5a0-d986-1f12-e0c960878e53
2020-05-21T15:22:38.514Z cpu1:1000213133)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T4:L0 cState: 0 nState: 6 isLSE: 0
2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateErrorInt:4172: Permanent error event on 52fcff55-6866-a3d0-d0d5-ba4e3c1d9362
2020-05-21T15:22:38.514Z cpu1:1000213133)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T3:L0 cState: 7 nState: 7 isLSE: 0
2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateErrorInt:4188: Error/unhealthy propagate event on 52934040-4111-9e8c-4d12-ad0f5635b3d6
2020-05-21T15:22:38.514Z cpu1:1000213133)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T6:L0 cState: 7 nState: 7 isLSE: 0
2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateErrorInt:4188: Error/unhealthy propagate event on 5287714a-e5a0-d986-1f12-e0c960878e53
Example 2:
2020-05-21T16:36:22.055Z cpu0:1000341426)WARNING: PLOG: DDPCacheIOCb:686: Trying to format a valid metadata block, UUID 528006c4-3f71-81c4-ae10-0ae7d661bba0, type 3, pbn 3298534904346
2020-05-21T16:36:22.055Z cpu1:1000214313)WARNING: PLOG: DDPCompleteDDPWrite:6455: Throttled: DDP write failed Invalid metadata callback
[email protected]#0.0.0.1, diskgroup 52379c29-607b-e423-f700-dc4386d74c6a txnScopeIdx 0
2020-05-21T16:36:22.055Z cpu1:1000214313)PLOG: DDPCompleteDDPWrite:6469: Throttled: (DDPWrite): Curr: completeTask, Prev: addNewHash, Status: Success
2020-05-21T16:36:22.055Z cpu1:1000214313)WARNING: PLOG: PLOGDDPWriteCbFn:655: DDP write failed on device 528006c4-3f71-81c4-ae10-0ae7d661bba0:Invalid metadata (ssdPerm: no)elevIo 0, doDdpCommit yes
2020-05-21T16:36:22.058Z cpu0:1000214307)PLOG: PLOGElevHandleFailure:2325: Waiting till we process failure ... dev 528006c4-3f71-81c4-ae10-0ae7d661bba0
2020-05-21T16:36:22.061Z cpu0:1000213234)WARNING: PLOG: PLOGPropagateError:4232: DDP: Propagating error state from original device 528006c4-3f71-81c4-ae10-0ae7d661bba0
2020-05-21T16:36:22.061Z cpu0:1000213234)WARNING: PLOG: PLOGPropagateError:4284: DDP: Propagating error state to MDs in device 52379c29-607b-e423-f700-dc4386d74c6a
2020-05-21T16:36:22.061Z cpu0:1000213234)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T4:L0 cState: 0 nState: 6 isLSE: 0
2020-05-21T16:36:22.061Z cpu0:1000213234)WARNING: PLOG: PLOGPropagateErrorInt:4172: Permanent error event on 528006c4-3f71-81c4-ae10-0ae7d661bba0
2020-05-21T16:36:22.061Z cpu0:1000213234)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T3:L0 cState: 7 nState: 7 isLSE: 0
2020-05-21T16:36:22.061Z cpu0:1000213234)WARNING: PLOG: PLOGPropagateErrorInt:4188: Error/unhealthy propagate event on 52c30f7b-abfb-3bf2-2bb1-6ed690e7d4f3
2020-05-21T16:36:22.061Z cpu0:1000213234)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T6:L0 cState: 7 nState: 7 isLSE: 0
2020-05-21T16:36:22.061Z cpu0:1000213234)WARNING: PLOG: PLOGPropagateErrorInt:4188: Error/unhealthy propagate event on 52379c29-607b-e423-f700-dc4386d74c6a
2020-05-21T16:36:25.915Z cpu0:1000214307)PLOG: PLOGRelogBase:226: RELOG: relogTask exit requested
2020-05-21T16:36:25.915Z cpu0:1000214307)PLOG: PLOGRelogExit:605: RELOG task exiting UUID 52379c29-607b-e423-f700-dc4386d74c6a Success
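The messages that recur across the examples in this section can be searched for in a saved copy of the vmkernel log. The following is a minimal Python sketch, not a VMware-provided tool. The labels in SIGNATURES and the function name find_signatures are hypothetical names chosen for illustration, and the sketch assumes that the source line numbers in the messages (such as ":686:") may differ between builds, so it matches any number there. It only reports matching lines and does not diagnose or repair anything.

```python
import re

# Timestamp format seen at the start of each vmkernel log line in the examples.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")
# UUID layout seen in the examples (8-4-4-4-12 lowercase hex digits).
UUID = r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"

# Hypothetical labels (not VMware terminology) paired with message text
# copied from the examples. Matching any line number is an assumption.
SIGNATURES = [
    ("format_valid_metadata_block",
     re.compile(r"DDPCacheIOCb:\d+: Trying to format a valid metadata block, UUID " + UUID)),
    ("propagate_to_mds",
     re.compile(r"PLOGPropagateError:\d+: DDP: Propagating error state to MDs in device " + UUID)),
    ("permanent_error",
     re.compile(r"PLOGPropagateErrorInt:\d+: Permanent error event on " + UUID)),
]


def find_signatures(lines):
    """Return (timestamp, label, uuid) tuples for lines matching a signature."""
    hits = []
    for line in lines:
        ts = TIMESTAMP.search(line)
        for label, pattern in SIGNATURES:
            match = pattern.search(line)
            if match:
                hits.append((ts.group(0) if ts else None, label, match.group("uuid")))
                break  # at most one signature is recorded per line
    return hits


# Lines taken from Example 1, plus one telemetry line that should not match.
SAMPLE = [
    "2020-05-21T15:22:38.514Z cpu1:1000341425)WARNING: PLOG: DDPCacheIOCb:686: Trying to format a valid metadata block, UUID 52fcff55-6866-a3d0-d0d5-ba4e3c1d9362, type 4, pbn 4398046515647",
    "2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateError:4284: DDP: Propagating error state to MDs in device 5287714a-e5a0-d986-1f12-e0c960878e53",
    "2020-05-21T15:22:38.514Z cpu1:1000213133)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T4:L0 cState: 0 nState: 6 isLSE: 0",
    "2020-05-21T15:22:38.514Z cpu1:1000213133)WARNING: PLOG: PLOGPropagateErrorInt:4172: Permanent error event on 52fcff55-6866-a3d0-d0d5-ba4e3c1d9362",
]

for hit in find_signatures(SAMPLE):
    print(hit)
```

To scan a real log, pass an open file object to find_signatures instead of SAMPLE. The location of the saved log file depends on your environment. Any matches should be included when you contact VMware.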
Example 3:
2020-05-21T16:56:00.941Z cpu1:1000341426)WARNING: PLOG: DDPCacheIOCb:686: Trying to format a valid metadata block, UUID 521d473e-2bd4-d796-b250-0587bd83fae9, type 5, pbn 5497558160057
2020-05-21T16:56:00.941Z cpu0:1000213922)WARNING: PLOG: DDPCompleteDDPWrite:6455: Throttled: DDP write failed Invalid metadata callback
[email protected]#0.0.0.1, diskgroup 5247de40-f42b-a0e3-a310-b4e7a2f5cbee txnScopeIdx 0
2020-05-21T16:56:00.941Z cpu0:1000213922)PLOG: DDPCompleteDDPWrite:6469: Throttled: (DDPWrite): Curr: completeTask, Prev: readXmap, Status: Success
2020-05-21T16:56:00.941Z cpu0:1000213922)WARNING: PLOG: PLOGDDPWriteCbFn:655: DDP write failed on device 521d473e-2bd4-d796-b250-0587bd83fae9:Invalid metadata (ssdPerm: no)elevIo 0, doDdpCommit yes
2020-05-21T16:56:00.941Z cpu0:1000213916)PLOG: PLOGElevHandleFailure:2325: Waiting till we process failure ... dev 521d473e-2bd4-d796-b250-0587bd83fae9
2020-05-21T16:56:00.941Z cpu0:1000213152)WARNING: PLOG: PLOGPropagateError:4232: DDP: Propagating error state from original device 521d473e-2bd4-d796-b250-0587bd83fae9
2020-05-21T16:56:00.941Z cpu0:1000213152)WARNING: PLOG: PLOGPropagateError:4284: DDP: Propagating error state to MDs in device 5247de40-f42b-a0e3-a310-b4e7a2f5cbee
2020-05-21T16:56:00.941Z cpu0:1000213152)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T4:L0 cState: 0 nState: 6 isLSE: 0
2020-05-21T16:56:00.943Z cpu0:1000213152)WARNING: PLOG: PLOGPropagateErrorInt:4172: Permanent error event on 521d473e-2bd4-d796-b250-0587bd83fae9
2020-05-21T16:56:00.943Z cpu0:1000213152)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T3:L0 cState: 7 nState: 7 isLSE: 0
2020-05-21T16:56:00.943Z cpu0:1000213152)WARNING: PLOG: PLOGPropagateErrorInt:4188: Error/unhealthy propagate event on 52cd91e1-e659-8d2c-f431-4e1e923217d0
2020-05-21T16:56:00.943Z cpu0:1000213152)PLOG: PLOG_FindAndUpdateDevTelemetryStat:1058: Setting devResState : dev: mpx.vmhba0:C0:T6:L0 cState: 7 nState: 7 isLSE: 0
2020-05-21T16:56:00.944Z cpu0:1000213152)WARNING: PLOG: PLOGPropagateErrorInt:4188: Error/unhealthy propagate event on 5247de40-f42b-a0e3-a310-b4e7a2f5cbee
2020-05-21T16:56:04.066Z cpu0:1000213916)PLOG: PLOGRelogBase:226: RELOG: relogTask exit requested