Disk group is reported as Unhealthy in the vSAN cluster.

Article ID: 400667

Products

VMware vSAN

Issue/Introduction

Symptoms:

vCenter Server UI displays the following error under Cluster > Configure > Disk Management:

Disk group status: Unhealthy

Environment

VMware vSAN 7.x
VMware vSAN 8.x

Cause

The cache disks on the ESXi host are missing, so the associated vSAN disk groups are marked as unhealthy.
To confirm, check the disk group configuration on the affected ESXi host and verify that the cache disk status is shown as Absent in the vSAN UI.

To validate further, review the /var/log/boot.gz file on the ESXi host. Hardware-level NVMe admin command timeout errors similar to the following are logged, and no NVMe devices are enumerated. This indicates a potential issue with the physical NVMe device or its controller.

2025-06-09T13:50:45.335Z cpu0:2098077)__nvme_SubmitSyncRequest: adapter:0 qid:0 req:0 error:bad0021 waiting on admin sync command
2025-06-09T13:50:46.843Z cpu0:2098077)nvme_AttachDevice
2025-06-09T13:50:46.843Z cpu0:2098077)nvme_PciInit: adapter->bar: 0x4521cc100000
2025-06-09T13:50:46.844Z cpu0:2098077)CpuSched: 840: user latency of 2098693 intel-nvme-ctrl-0-cqw-69 0 changed by 2098077 vmkdevmgr -6
2025-06-09T13:50:59.846Z cpu0:2098077)__nvme_SubmitSyncRequest: adapter:0 qid:0 req:0 error:bad0021 waiting on admin sync command
2025-06-09T13:51:02.894Z cpu2:2098077)vmd_PciBusAddDevices: NVMe device added = 0
2025-06-09T13:51:02.894Z cpu2:2098077)vmd_EnumerateDevices: NVMe devices attached to VMD = 0
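
The check against the log excerpt above can be partly automated. The following is a minimal sketch, not a supported tool: it assumes the boot log has been copied to a machine with Python 3, and that the relevant lines follow the same wording as the excerpt. The function names scan_boot_log and scan_boot_gz are hypothetical, and the regular expressions are derived only from the sample lines, so other driver versions may log differently.

```python
import gzip
import re

# Patterns derived from the sample log lines in this article (assumption:
# other ESXi builds or NVMe drivers may word these messages differently).
TIMEOUT_RE = re.compile(
    r"__nvme_SubmitSyncRequest: adapter:(\d+) qid:(\d+) req:(\d+) "
    r"error:(\w+) waiting on admin sync command"
)
VMD_COUNT_RE = re.compile(
    r"vmd_EnumerateDevices: NVMe devices attached to VMD = (\d+)"
)


def scan_boot_log(lines):
    """Scan an iterable of log lines.

    Returns (timeouts, vmd_count), where timeouts is a list of dicts for
    each admin command error line and vmd_count is the last reported
    number of NVMe devices attached to VMD (None if never reported).
    """
    timeouts = []
    vmd_count = None
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts.append({
                # The timestamp is the first space-separated field.
                "timestamp": line.split(" ", 1)[0],
                "adapter": int(match.group(1)),
                "error": match.group(4),
            })
            continue
        match = VMD_COUNT_RE.search(line)
        if match:
            vmd_count = int(match.group(1))
    return timeouts, vmd_count


def scan_boot_gz(path="/var/log/boot.gz"):
    """Open a gzip-compressed boot log as text and scan it."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return scan_boot_log(handle)
```

Calling scan_boot_gz() on a copy of the file returns the matching error lines and the VMD device count. One or more entries with error bad0021 together with a device count of 0 matches the pattern described in this article.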

Resolution

It is recommended to engage the hardware vendor to validate the issue and investigate any potential faults with the NVMe device or its controller.

Workaround:

As a temporary measure, power-cycle the ESXi host via the hardware management interface (e.g., iLO or iDRAC), then verify whether the cache drive is detected and the disk group returns to a healthy state.
