vSAN disk in "Detached/Unmounted" Operation State and "Ineligible" Claimable State

Products

VMware vSAN

Issue/Introduction

Symptom:

vSAN Skyline Health reports an "Operation health" alarm. Clicking "Troubleshoot" for this alert shows that one or more vSAN disks are reporting an error.
The vSAN disk appears with the Operational State "Unmounted" on a vSAN node. To view it, navigate to vSphere Client > vSAN Cluster > Configure > vSAN - Disk Management, select the affected host, click View Disks, and expand "Ineligible and unclaimed".
In some instances, the device is in the Operation State "Detached" and the Claimable State "Ineligible".

Environment

VMware vSphere vSAN 7.x
VMware vSphere vSAN 8.x

Cause

The physical disk used by vSAN was detected to be faulty.
The faulty state of the disk can be validated in several host logs:
- SSH to the vSAN node and open "/var/run/log/vobd.log". I/O errors detected on the disk are logged here:
  
  2025-03-03T00:30:07.975Z: [scsiCorrelator] 17381645978626us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.################
  2025-03-03T00:31:22.527Z: [scsiCorrelator] 17381720530080us: [vob.scsi.device.too.many.io.error] Too many errors observed for device naa.################ errPercentage 74
  
  For NVMe devices, log entries similar to the following are seen:
  
  2026-03-25T04:13:29.638Z In(14) vobd[2097955]: [psastorCorrelator] 2477826311967us: [vob.psastor.device.too.many.io.error] Too many errors observed for device t10.NVMe____Dell_Ent_NVMe_P5500_RI_U.2_7.68TB_______000############# errPercentage 100
- In "/var/run/log/vmkwarning.log", log entries similar to the following:
  
  2025-03-12T05:08:24.209Z cpu67:2101847 opID=3dbce423)WARNING: ScsiDeviceIO: 12155: READ CAPACITY on device "naa.################" from Plugin "HPP" failed. I/O error
  2025-03-12T05:08:33.531Z cpu9:2104069)WARNING: ScsiDeviceIO: 12155: READ CAPACITY on device "naa.################" from Plugin "HPP" failed. I/O error
- In "/var/run/log/vmkernel.log", log entries similar to the following:
  
  2025-03-12T06:19:59.910Z cpu5:2101934 opID=de30a3c2)WARNING: ScsiDeviceIO: 12155: READ CAPACITY on device "naa.################" from Plugin "HPP" failed. I/O error
  2025-03-12T06:19:59.910Z cpu2:34236323)ScsiDevice: 612: Could not flush cache of local device naa.################. Failure
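
The vobd.log check above can be scripted when many hosts or long logs need reviewing. The following is a minimal Python sketch, assuming a copy of vobd.log is available as text; the function name `find_failing_devices` and the regular expression are illustrative and are not part of any VMware tool. It matches only the "too.many.io.error" entries shown in the samples above.

```python
import re

# Matches the "too many I/O errors" entries shown above for SCSI devices
# (vob.scsi...) and NVMe devices (vob.psastor...). The pattern is an
# assumption derived from the sample log lines only.
TOO_MANY_ERRORS = re.compile(
    r"\[vob\.(?:scsi|psastor)\.device\.too\.many\.io\.error\]"
    r".*?for device (?P<device>\S+) errPercentage (?P<pct>\d+)"
)


def find_failing_devices(lines):
    """Return {device: highest errPercentage seen} for matching log lines."""
    worst = {}
    for line in lines:
        match = TOO_MANY_ERRORS.search(line)
        if match:
            device = match.group("device")
            pct = int(match.group("pct"))
            worst[device] = max(pct, worst.get(device, 0))
    return worst
```

For example, `find_failing_devices(open("vobd.log", errors="replace"))` returns a dictionary keyed by device identifier, which can then be compared against the disk shown as "Unmounted" or "Detached" in Disk Management.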
Errors on the device may also be reported in the server's hardware management console (for example, iDRAC).

Note: A cold boot of the host from the hardware management interface is sometimes required before the faulty device is detected at the hardware level. Place the host in maintenance mode (Ensure accessibility or Full data migration) before rebooting.

Resolution

To replace the faulty disk:

Place the host with the absent disk in maintenance mode with "Ensure accessibility".
Engage the hardware vendor to physically replace the failed disk in the server.
Then, depending on the type of the failed disk (cache or capacity) and whether deduplication is enabled, follow the steps below to bring the new drive into vSAN:
1. If deduplication is enabled on the cluster or if the absent disk was a cache device:
  1. Delete the disk group containing the absent vSAN disk.
  2. Re-create the disk group with the existing disks and the new disk.
2. If deduplication is not enabled and the absent disk was a capacity device:
  1. Remove the absent vSAN disk from the disk group.
  2. Add the new disk to the disk group.
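
The choice between the two procedures above can be expressed as a small decision helper. This is a sketch only: the function name `replacement_procedure` and its return strings are hypothetical, and it reads the second case as applying only when deduplication is disabled and the failed disk is a capacity device.

```python
def replacement_procedure(dedup_enabled: bool, disk_tier: str) -> str:
    """Pick the replacement procedure for a failed vSAN disk.

    disk_tier is "cache" or "capacity" (case-insensitive).
    The return values are illustrative labels, not vSAN API names.
    """
    tier = disk_tier.strip().lower()
    if tier not in ("cache", "capacity"):
        raise ValueError(f"unknown disk tier: {disk_tier!r}")

    # Case 1: deduplication enabled, or the failed disk is the cache device.
    # The whole disk group is deleted and re-created with the new disk.
    if dedup_enabled or tier == "cache":
        return "delete and re-create the disk group"

    # Case 2: deduplication disabled and a capacity device failed.
    # Only the failed disk is removed and the new disk added.
    return "remove the absent disk and add the new disk"
```

Only the combination of deduplication disabled and a failed capacity device allows the single disk to be swapped; every other combination requires re-creating the disk group.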

Additional Information

For more details, refer to Dying Disk Handling (DDH) in vSAN.

Article ID: 391244
