Issue Validation:
The vsan storage list output confirms the following for the affected cache-tier device:

naa.#####9:
   Device: naa.5####
   Display Name: naa.######
   Is SSD: true
   VSAN UUID: #####
   VSAN Disk Group UUID: #####4
   VSAN Disk Group Name: ######
   Used by this host: true
   In CMMDS: true
   On-disk format version: 19
   Deduplication: false
   Compression: false
   Checksum: 180###5381
   Checksum OK: true
   Is Capacity Tier: false
   Encryption Metadata Checksum OK: true
   Encryption: false
   DiskKeyLoaded: false
   Is Mounted: true
   Creation Time: Tue Apr 7 07:22:36 2026

Environment: VMware vSAN 8.x (OSA)
LLOG accumulation occurs when the Commit Flusher stops moving data from the write buffer to the PLOG. This backlog is typically caused by high latency or hardware failures on the underlying cache disks.
Cause Validation:
Run the following on the affected host to read the congestion counters for the affected device:

for ssd in $(localcli vsan storage list | grep "5###94" | awk '{print $5}' | sort -u); do echo $ssd; vsish -e get /vmkModules/lsom/disks/$ssd/info | grep Congestion; done

Output from [root@C####3:/vmfs/volumes/6####/log]:

memCongestion:0
slabCongestion:0
ssdCongestion:0
iopsCongestion:0
logCongestion:252
compCongestion:0
maxDeleteCongestion:0
mdDeleteCongestion:0
memCongestionLocalMax:0
slabCongestionLocalMax:0
ssdCongestionLocalMax:0
iopsCongestionLocalMax:0
logCongestionLocalMax:252
compCongestionLocalMax:0
mdDeleteCongestionLocalMax:0
================================================
########### NOTE: it will not display anything if zero
logCongestion:252
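The counter check above can be narrowed to the values that matter. The following is a minimal sketch, not a VMware-supplied tool: `flag_congestion` is a hypothetical helper name, and the default threshold of 200 is an assumption taken from the "Congestion Threshold: 200" value in the vmkernel messages of this case.

```shell
# Hypothetical helper: read "name:value" congestion counters on stdin,
# print only the non-zero ones, and mark those above the threshold.
# The default threshold (200) is assumed from the VOB messages in this case.
flag_congestion() {
    threshold="${1:-200}"
    awk -F: -v t="$threshold" '
        /Congestion/ && ($2 + 0) > 0 {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim indentation around the name
            status = (($2 + 0) > (t + 0)) ? "EXCEEDED" : "nonzero"
            printf "%s=%d %s\n", name, $2, status
        }'
}

# Example use on a host (path taken from the command above):
#   vsish -e get /vmkModules/lsom/disks/$ssd/info | flag_congestion
```

With the counters shown for this host, only logCongestion and logCongestionLocalMax would be reported, both marked as exceeding the threshold.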
52f###-f###-###-###-#####
LLOG consumption: 23.9982
PLOG consumption: 0.00183868
Total log consumption: 24
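The consumption figures above show that almost all log space is held by the LLOG, which is what points to a stalled flush into the PLOG. As a rough sketch, the share can be computed from that output; `llog_share` is a hypothetical helper name, and it assumes the input lines use exactly the "LLOG consumption:" and "PLOG consumption:" labels shown above.

```shell
# Hypothetical helper: compute the LLOG share of total log consumption
# from lines shaped like "LLOG consumption: 23.9982".
llog_share() {
    awk -F': *' '
        /^LLOG consumption/ { llog = $2 }
        /^PLOG consumption/ { plog = $2 }
        END {
            total = llog + plog
            if (total > 0) printf "LLOG share: %.2f%%\n", 100 * llog / total
        }'
}
```

Fed the three consumption lines from this case, the sketch reports an LLOG share of 99.99%, consistent with data accumulating in the LLOG rather than being moved on.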
2026-04-07T15:31:17.644Z In(182) vmkernel: cpu38:27301040)LSOM: LSOMThrowAsyncCongestionVOB:550: LSOM MemCong in ##### Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 204.
2026-04-07T15:37:30.897Z In(182) vmkernel: cpu37:27301040)LSOM: LSOMThrowAsyncCongestionVOB:550: LSOM SSDCong in ######4 Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 204.
2026-04-16T01:59:48.986Z In(182) vmkernel: cpu23:27301040)LSOM: LSOMThrowAsyncCongestionVOB:550: LSOM LogCong in 5##### Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 202.
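To see how the congestion type changed over time (MemCong and SSDCong on 2026-04-07, LogCong on 2026-04-16), the VOB messages can be condensed to one short line each. This is a sketch only: `summarize_congestion_vobs` is a hypothetical helper, it assumes one log entry per line in the format shown above, and the log path in the usage comment is an assumption about where vmkernel.log lives on the host.

```shell
# Hypothetical helper: condense LSOMThrowAsyncCongestionVOB entries read
# from stdin to "<timestamp> <type> current=<n> threshold=<n>".
summarize_congestion_vobs() {
    grep 'LSOMThrowAsyncCongestionVOB' | sed -n \
        's/^\([^ ]*\) .*LSOM \([A-Za-z]*Cong\) in .*Congestion Threshold: \([0-9]*\) Current Congestion: \([0-9]*\).*/\1 \2 current=\4 threshold=\3/p'
}

# Example use (log path is an assumption):
#   summarize_congestion_vobs < /var/run/log/vmkernel.log
```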
2026-04-07T07:22:36.997Z In(182) vmkernel: cpu17:2099564 opID=e511c2e4)LSOMCommon: SSDLOGInitDescForIO:1153: device: 52######94 Recovering ssdlog. It might take a while...
2026-04-07T07:22:36.997Z In(182) vmkernel: cpu17:2099564 opID=e511c2e4)LSOMCommon: SSDLOG_IsValidCP:214: device: 52######4 Invalid checkpoint magic/version. Magic 0x0,ver 0:0
2026-04-07T07:22:36.997Z In(182) vmkernel: cpu17:2099564 opID=e511c2e4)LSOMCommon: SSDLOG_IsValidCP:214: device: 5#### Invalid checkpoint magic/version. Magic 0x0,ver 0:0
2026-04-07T07:22:36.997Z In(182) vmkernel: cpu17:2099564 opID=e511c2e4)LSOMCommon: SSDLOG_Recover:340: device: 5###### Both checkpoints are invalid.. Disk needs to be initialized
2026-04-07T07:22:36.997Z In(182) vmkernel: cpu17:2099564 opID=e511c2e4)LSOMCommon: SSDLOGInitDescForIO:1157: device: 5######### SSD is not initialized, initializing...
2026-04-07T12:12:32.252Z In(182) vmkernel: cpu19:2097829)ScsiDeviceIO: 4580: Cmd(0x45bae7d6ddc0) 0x2a, CmdSN 0x1bf65aef from world 0 to dev "#####9" failed H:0xc D:0x0 P:0x0
2026-04-07T12:12:32.450Z Wa(180) vmkwarning: cpu2:2097827)WARNING: HPP: HppScsiThrottleLogForDevice:585: Cmd 0x2a (0x45bac84dabc0, 0) to dev "####9" on path "vmhba1:C0:T8:L0" Failed:
2026-04-07T12:12:32.450Z Wa(180) vmkwarning: cpu2:2097827)WARNING: HPP: HppScsiThrottleLogForDevice:593: Error status H:0x8 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. hppAction = 3
2026-04-17T07:27:36.784Z In(14) vobd[2097588]: [vSANCorrelator] 10290776758635us: [vob.vsan.lsom.diskunhealthy] vSAN device 52######## is unhealthy.
2026-04-17T07:27:36.784Z In(14) vobd[2097588]: [vSANCorrelator] 10290886526702us: [esx.problem.vob.vsan.lsom.diskunhealthy] vSAN device #######4 is unhealthy.
2026-04-17T07:27:36.784Z In(14) vobd[2097588]: [vSANCorrelator] 10290776758643us: [vob.vsan.lsom.diskgrouplogcongested] vSAN diskgroup ###### log is congested.
2026-04-17T07:27:36.784Z In(14) vobd[2097588]: [vSANCorrelator] 10290886526788us: [esx.problem.vob.vsan.lsom.diskgrouplogcongested] vSAN diskgroup 5######## log is congested.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: WARNING - Maximum log congestion on VSAN device naa.### 2/2 times.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Found congestion: Evacuating disk naa.5####..
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Exception getting SMART health status for vSAN disk naa###9.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Critical SMART health attributes for VSAN device naa.#### are shown below.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Uncorrectable sectors: 0.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Reported uncorrectable sectors: 0.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Sector reallocation events: 0.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Sectors successfully reallocated: 0.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Pending sector reallocations: 0.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Disk command timeouts: 0.
2026-04-17T07:27:36Z In(14) vsandevicemonitord[2099430]: Tier 1 (naa.5#####9) failure due to log congestion.
2026-04-17T07:37:37Z In(14) vsandevicemonitord[2099430]: Device naa.#####9 state is DISKGROUP_UNDER_PERM_ERROR
Scenario A (Workaround): Disk State Is Healthy
If the reboot clears the error and the disk shows as healthy, rebuild the disk group as follows:
Remove the affected disk group.
Re-create and add the disk group back to the vSAN cluster.
Exit Maintenance Mode on the ESXi host.
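For hosts where these steps are done from the ESXi shell rather than the vSphere Client, the sequence can be sketched as below. This is an illustrative dry run under assumptions: `print_rebuild_plan` is a hypothetical helper that only prints the esxcli commands, the device names passed to it are placeholders, and data evacuation options are deliberately left out, so review each command against the cluster's state before running it.

```shell
# Hypothetical dry-run helper: print (never execute) the esxcli commands
# for removing and re-creating an OSA disk group, then leaving
# Maintenance Mode.
# Arguments: cache device first, then one or more capacity devices.
print_rebuild_plan() {
    cache="$1"
    shift
    # Removing the cache device removes the whole disk group.
    echo "esxcli vsan storage remove -s $cache"
    # Re-create the group with the same cache and capacity devices.
    cmd="esxcli vsan storage add -s $cache"
    for cap in "$@"; do
        cmd="$cmd -d $cap"
    done
    echo "$cmd"
    echo "esxcli system maintenanceMode set -e false"
}

# Example (placeholder device names):
#   print_rebuild_plan naa.CACHE naa.CAP1 naa.CAP2
```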
Scenario B (Resolution): Disk State Remains Unhealthy (Hardware Failure)
If the disk still shows as unhealthy after the reboot, the drive has entered a permanent error state.
Keep the host in Maintenance Mode.