SSDs experience unexpected failures at 32k/40k power-on hours

Article ID: 329029

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:
SSDs fail unexpectedly at 32k/40k power-on hours
Disk groups go offline

Affected firmware:
Dell firmware below D417
HPE firmware below HPD7
Cisco firmware below C405

To see the current disk firmware and model on an ESXi host, run the following command:

localcli storage core device list | grep -E 'Revision|Model'
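
The grep above prints Model and Revision lines without the device they belong to. As a minimal sketch, the function below pairs each device identifier with its model and firmware revision on one line. The name summarize_devices is hypothetical, and the parsing assumes that device identifiers start in the first column and that attribute lines are indented, with Model listed before Revision.

```shell
#!/bin/sh
# Hypothetical helper: reads "localcli storage core device list" output on
# stdin and prints one tab-separated line per device: <device> <model> <revision>.
# Assumption: device identifiers are unindented and attribute lines are indented.
summarize_devices() {
    awk '
        { sub(/[ \t]+$/, "") }                      # drop trailing padding
        /^[^ \t]/ { dev = $1; model = ""; next }    # unindented line starts a new device
        /^[ \t]+Model:/ {
            sub(/^[ \t]+Model:[ \t]*/, "")
            model = $0
            next
        }
        /^[ \t]+Revision:/ {
            sub(/^[ \t]+Revision:[ \t]*/, "")
            printf "%s\t%s\t%s\n", dev, model, $0
        }
    '
}

# Usage on an ESXi host:
#   localcli storage core device list | summarize_devices
```

The output can then be compared against the model lists in the vendor advisories.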
 
For more information and the affected disk and server models, see the advisories below:
HPE SAS SSD Remediation Guide

HPE SAS Solid State Drives - Critical Firmware Upgrade Required for Certain HPE SAS Solid State Drive Models to Prevent Drive Failure at 40,000 Hours of Operation

Dell EMC Enterprise SSDs, model numbers LT0200MO, LT0400MO, LT0800MO, LT1600MO, LT0800RO, LT1600RO, LT0200WM, LT0400WM and LT0800WM unexpected failures at 40,000 power-on hours

Cisco Field Notice: FN - 70545 - SSD Will Fail at 40,000 Power-On Hours - BIOS/Firmware Upgrade Recommended

Cause

A defect in the SSD firmware causes affected drives to fail once they reach 32k/40k power-on hours.

Resolution

Ensure backups are current.
Upgrade the SSDs to the recommended firmware listed in the advisories above.

Dell recommended firmware: D417+
HPE recommended firmware: HPD7+
Cisco recommended firmware: C405+
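
As a rough sketch, a reported revision can be checked against one of the minimums above with a plain string comparison. The helpers fw_below and check_fw are hypothetical, and the comparison assumes that revisions within one vendor family share the same prefix and length (for example HPD2 and HPD7), so that byte-wise sort order matches release order. The vendor advisories remain the authority on which firmware applies to which drive model.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when revision $1 sorts strictly below minimum $2.
# Assumption: both strings share a vendor prefix and length, so byte-wise
# ordering reflects release order. This is not a vendor-defined rule.
fw_below() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | head -n 1)" = "$1" ]
}

# Usage: check_fw <reported revision> <recommended minimum>
check_fw() {
    if fw_below "$1" "$2"; then
        echo "$1: below $2, upgrade recommended"
    else
        echo "$1: at or above $2"
    fi
}

# Example:
#   check_fw HPD2 HPD7
```

The minimum to pass in depends on the drive vendor: D417 for Dell, HPD7 for HPE, C405 for Cisco.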

Workaround:
At the first sign of SSD failures, start upgrading the firmware before more disks fail.

If multiple disks have already failed, engage Dell/HPE/Cisco support to get replacement SSDs, recreate the disk groups, and restore from backups.

Engage VMware support if additional assistance is required to get the environment back into a healthy state.

Additional Information

Impact/Risks:
This issue can cause data loss if the disk firmware is not upgraded to the recommended levels, as per the advisories above, before disks fail.

Dell recommended firmware: D417+
HPE recommended firmware: HPD7+
Cisco recommended firmware: C405+