vSAN -- Event logs: vSAN detected and fixed a medium or checksum error for component ######## on disk group ########


Article ID: 405199

Updated On:

Products

VMware vSAN
VMware vSAN 7.x
VMware vSAN 8.x

Issue/Introduction

One or more of the following errors appear in the host logs (event logs):
Example Output:

YYYY-MM-DDTHH:MM:SS.ZZ cpu17:2099249)WARNING: LSOM: LSOMScrubReadComplete:2838: Throttled: Checksum error detected on component ########-####-####-####-########, comp offset 179160940544 (computed CRC ######## != saved CRC ######## 

YYYY-MM-DDTHH:MM:SS.ZZ info hostd[2105242] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 588 : vSAN detected and fixed a medium or checksum error for component ########-####-####-####-######## on disk group ########-####-####-####-########.

 
These messages may appear in the following logs:
/var/log/hostd.log
/var/log/vmkwarning.log
/var/log/vmkernel.log
 
 
In addition, SCSI Medium Errors are reported for one or more vSAN disks:
Example Output:
YYYY-MM-DDTHH:MM:SS.ZZ cpu4:2098099)ScsiDeviceIO: 4173: Cmd(0x45c00fb53c88) 0x28, CmdSN 0x3aae3a2 from world 0 to dev "################" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x3 0x11 0x1 Medium Error, LBA: ########
 
These messages may appear in the following logs:
/var/log/vmkernel.log
/var/log/vmkwarning.log
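
When reviewing many such lines, it can help to extract the device name and sense data programmatically. The following is a minimal sketch, not a supported tool: the regular expression is modeled only on the example line above, so lines from other ESXi builds may not match, and the function name and the sample device name are hypothetical.

```python
import re

# Pattern modeled on the example ScsiDeviceIO line in this article (assumption:
# other ESXi builds may format the line differently).
SCSI_FAILURE = re.compile(
    r'to dev "(?P<device>[^"]+)" failed '
    r'H:(?P<host>0x[0-9a-fA-F]+) D:(?P<dev_status>0x[0-9a-fA-F]+) P:(?P<plugin>0x[0-9a-fA-F]+) '
    r'Valid sense data: (?P<key>0x[0-9a-fA-F]+) (?P<asc>0x[0-9a-fA-F]+) (?P<ascq>0x[0-9a-fA-F]+)'
)


def parse_scsi_failure(line):
    """Return device, status, and sense fields from a failure line, or None if no match."""
    match = SCSI_FAILURE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    device = fields.pop("device")
    # Convert the remaining hex strings (host, dev_status, plugin, key, asc, ascq) to integers.
    parsed = {name: int(value, 16) for name, value in fields.items()}
    parsed["device"] = device
    # SCSI sense key 0x3 denotes a medium error.
    parsed["is_medium_error"] = parsed["key"] == 0x3
    return parsed


# Hypothetical sample; "naa.EXAMPLE" stands in for a real device identifier.
sample = (
    'cpu4:2098099)ScsiDeviceIO: 4173: Cmd(0x45c00fb53c88) 0x28, CmdSN 0x3aae3a2 '
    'from world 0 to dev "naa.EXAMPLE" failed H:0x0 D:0x2 P:0x0 '
    'Valid sense data: 0x3 0x11 0x1'
)
result = parse_scsi_failure(sample)
print(result["device"], hex(result["key"]), hex(result["asc"]), hex(result["ascq"]))
```

Grouping the parsed results by device shows which physical disk to raise with the hardware vendor.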
 

Note: In certain cases, the only alert observed in the vobd.log may be:

vSAN detected and fixed a medium or checksum error for component d7a08768-xxxx-xxxx-xxxx-1423f2aec460

If such errors occur repeatedly for the same component, check the disk backing that component and involve the hardware team for further investigation.
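
To spot components that report this event repeatedly, the log lines can be tallied per component. This is a rough sketch under stated assumptions: the pattern is taken from the event text quoted in this article, the function names are hypothetical, and the threshold of 2 is an arbitrary illustration because the article does not define how many occurrences count as "repeated".

```python
import re
from collections import Counter

# Event text as quoted in this article; the component ID is the token that follows it.
EVENT = re.compile(
    r"vSAN detected and fixed a medium or checksum error "
    r"for component (?P<component>[0-9A-Za-z-]+)"
)


def count_events_by_component(lines):
    """Count how many matching event lines mention each component ID."""
    counts = Counter()
    for line in lines:
        match = EVENT.search(line)
        if match:
            counts[match.group("component")] += 1
    return counts


def repeated_components(lines, threshold=2):
    """Return component IDs seen at least `threshold` times (threshold is an assumption)."""
    counts = count_events_by_component(lines)
    return sorted(comp for comp, seen in counts.items() if seen >= threshold)
```

In practice `lines` would be read from a copy of vobd.log or hostd.log, for example with `open(path, errors="replace")`. The component IDs returned are the ones whose backing disk is worth checking first.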

Environment

vSAN 7.x
vSAN 8.x

Cause

The reported checksum events are caused by underlying SCSI Medium Errors on one or more vSAN disks, that is, a disk hardware issue.

Types of medium errors that may be seen in the logs (sense key, ASC, ASCQ):
0x3 0x3 0x0 - Peripheral device write fault
0x3 0x10 0x0 - ID CRC or ECC error
0x3 0x11 0x0 - Unrecovered read error
0x3 0x31 0x0 - Medium format corrupted
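
The list above can be turned into a small lookup for triaging sense data copied from the logs. This is an illustrative sketch only: the table holds just the four codes listed in this article, the names are hypothetical, and any other medium-error code, such as the 0x3 0x11 0x1 in the example output, falls through to a generic description.

```python
# (sense key, ASC, ASCQ) -> description, limited to the codes listed in this article.
MEDIUM_ERROR_CODES = {
    (0x3, 0x03, 0x00): "Peripheral device write fault",
    (0x3, 0x10, 0x00): "ID CRC or ECC error",
    (0x3, 0x11, 0x00): "Unrecovered read error",
    (0x3, 0x31, 0x00): "Medium format corrupted",
}


def describe_sense(sense_text):
    """Describe sense data given as text such as '0x3 0x11 0x0'."""
    key, asc, ascq = (int(part, 16) for part in sense_text.split())
    known = MEDIUM_ERROR_CODES.get((key, asc, ascq))
    if known is not None:
        return known
    if key == 0x3:
        # Sense key 0x3 is a medium error even when the ASC/ASCQ pair is not in the table.
        return "Medium error (ASC/ASCQ not in the list above)"
    return "Not a medium error (sense key is not 0x3)"


print(describe_sense("0x3 0x11 0x0"))
print(describe_sense("0x3 0x11 0x1"))
```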

Resolution

Contact the hardware vendor and have the disk(s) reporting "0x3 0x11 0x1 Medium Error" replaced.
Follow the disk replacement instructions to swap the failed disk for the new one.

Alternatively, follow the instructions in KB 397059, section "Resolution", to remove and re-add the reported disk(s) so that the system can reallocate the bad blocks and stop using them.

 

Additional Information

See also KB 326850, section "Resolution", Step 3, for a workaround if you run into issues with VMs.