COMPONENT METADATA HEALTH ERRORS REPORTED in vSAN 6.7 AFTER HOST REBOOT OR DISK GROUP MOUNT

Article ID: 326762

Products

VMware vSphere ESXi

Issue/Introduction

This article describes the issue and provides the workaround steps and the resolution.

Symptoms:
Hosts are running any version of ESXi/vSAN 6.7.
After a host is rebooted or a vSAN disk group is mounted, the vSAN health check (Skyline Health) reports components with invalid metadata.

In all UI and log messages, your host names, UUIDs, timestamps, and other details will vary.

On vCenter, the vSAN health summary logs may contain messages similar to the following:
2021-11-13T00:48:19.992Z INFO vsan-mgmt[healthThread-29842] [VsanHealthSummaryLogUtil::PrintHealthResult opID=06aaed9c] Cluster vSAN Cluster Overall Health : red
Group physicaldisks health : red
Test physdiskoverall health : red
DisksWithIssues: Host Disk OverallHealth MetadataHealth OperationalHealth InCmmds/Vsi OperationalStateDescription Recommendation Uuid
(Host-45891, AbsentVsanDisk(VsanUuid:52C8Dfd6-4D3D-32E2-4A77-B417F57Be086), Red, Green, Red, No/No, PermanentDiskLoss, PleaseRemoveTheDisk, 52C8Dfd6-4D3D-32E2-4A77-B417F57Be086),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deecc00), Red, Red, Red, No/No, PermanentDiskLossInDiskGroup, PleaseUnmountTheDiskGroup, 52E14C4F-32Eb-5Eed-F28F-3C9Bf94B50F4),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deeecfe), Red, Red, Red, No/No, PermanentDiskLossInDiskGroup, PleaseUnmountTheDiskGroup, 52C2F341-53F5-85Ce-9B34-7Bb420F73A8D),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deec136), Red, Red, Red, No/No, PermanentDiskLossInDiskGroup, PleaseUnmountTheDiskGroup, 523Bb81D-1024-Dd5F-C393-8F1391A0Bc5E),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deed39A), Red, Red, Red, No/No, PermanentDiskLossInDiskGroup, PleaseUnmountTheDiskGroup, 52364A19-Fbb8-6A3C-3509-24Cac2091223),
Test physdiskcapacity health : yellow
DisksWithIssues: Host Disk Capacity FreeSpace RebalanceState Uuid
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deed5B8), Yellow, 43.99Gb(5%), ReactiveRebalanceTaskIsInProgress, 52Cba6F1-Dbe2-A544-48Fc-0B2D6F31E60A),
(Host-45891, localAtaDisk(Naa.55Cd2E414Deeede4), Yellow, 59.77Gb(8%), ReactiveRebalanceTaskIsInProgress, 5256761D-7362-630A-8458-B9E107Ee0Be6),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deec9C3), Yellow, 59.08Gb(7%), ReactiveRebalanceTaskIsInProgress, 5210A8Ad-8B62-1Dac-91F4-73Ee0A14Af47),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deec656), Yellow, 55.50Gb(7%), ReactiveRebalanceTaskIsInProgress, 527Afb6D-78B8-Ed7E-Fa1C-B8Bcd8B8F665),
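
When many disks are listed, it can help to pull out only the entries whose MetadataHealth column is red. The following is a minimal sketch, not a VMware tool: it assumes the log text has been copied into a string, that each physdiskoverall entry sits on its own line, and that the nine columns appear in the order given by the DisksWithIssues header shown above. The names `DiskIssue`, `parse_disk_issues`, and `metadata_red` are illustrative only.

```python
from typing import List, NamedTuple


class DiskIssue(NamedTuple):
    """One physdiskoverall entry; field order follows the DisksWithIssues header."""
    host: str
    disk: str
    overall_health: str
    metadata_health: str
    operational_health: str
    in_cmmds_vsi: str
    state_description: str
    recommendation: str
    uuid: str


def parse_disk_issues(log_text: str) -> List[DiskIssue]:
    """Collect every line that is a single parenthesised entry with nine fields."""
    issues = []
    for raw in log_text.splitlines():
        line = raw.strip().rstrip(",")
        if not (line.startswith("(") and line.endswith(")")):
            continue  # header lines, test names, wrapped fragments
        fields = [f.strip() for f in line[1:-1].split(",")]
        # Capacity entries have fewer columns and are skipped here.
        if len(fields) == len(DiskIssue._fields):
            issues.append(DiskIssue(*fields))
    return issues


def metadata_red(issues: List[DiskIssue]) -> List[DiskIssue]:
    """Keep only the entries whose MetadataHealth column is red."""
    return [i for i in issues if i.metadata_health.lower() == "red"]


# Sample lines copied from the log excerpt above.
SAMPLE = """\
(Host-45891, AbsentVsanDisk(VsanUuid:52C8Dfd6-4D3D-32E2-4A77-B417F57Be086), Red, Green, Red, No/No, PermanentDiskLoss, PleaseRemoveTheDisk, 52C8Dfd6-4D3D-32E2-4A77-B417F57Be086),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deecc00), Red, Red, Red, No/No, PermanentDiskLossInDiskGroup, PleaseUnmountTheDiskGroup, 52E14C4F-32Eb-5Eed-F28F-3C9Bf94B50F4),
(Host-45891, LocalAtaDisk(Naa.55Cd2E414Deec9C3), Yellow, 59.08Gb(7%), ReactiveRebalanceTaskIsInProgress, 5210A8Ad-8B62-1Dac-91F4-73Ee0A14Af47),
"""

if __name__ == "__main__":
    for issue in metadata_red(parse_disk_issues(SAMPLE)):
        print(issue.host, issue.disk, issue.uuid)
```

Splitting on commas is sufficient here only because none of the fields in the sample contain a comma; a log with a different layout would need a stricter parser.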


Environment

VMware vSphere ESXi 6.7

Cause

This issue occurs when a component is deleted as part of a normal workflow, but the deletion is interrupted and only partially completes. The incomplete deletion leaves the component in an orphaned state, which is reported as invalid metadata. The condition is detected during host boot or when the disk group is mounted.

Resolution

This issue is resolved in ESXi/vSAN 8.0 and later releases.

Workaround:
Remove the impacted disk group with the Ensure Data Accessibility option.
Recreate the disk group.
Allow any needed resync to complete.
In some scenarios, removing the disk group may not be possible. In that case, contact VMware vSAN Support, who can use a special tool to remove the impacted components directly.
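
If the workaround is scripted, the "allow any needed resync to complete" step amounts to polling until the resync backlog reaches zero. The sketch below shows only that waiting logic. `get_bytes_to_sync` is a hypothetical callable that you would have to supply yourself; this article does not specify how to obtain the value, and nothing here is a VMware API.

```python
import time
from typing import Callable


def wait_for_resync(
    get_bytes_to_sync: Callable[[], int],
    poll_seconds: float = 60.0,
    timeout_seconds: float = 24 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the backlog is zero. Return True on completion, False on timeout.

    get_bytes_to_sync is a hypothetical, caller-supplied function that returns
    the number of bytes still waiting to be resynchronized.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_bytes_to_sync() <= 0:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)


if __name__ == "__main__":
    # Stand-in readings for demonstration; a real caller would query the cluster.
    readings = iter([300, 120, 0])
    done = wait_for_resync(lambda: next(readings), poll_seconds=0.0)
    print("resync complete" if done else "timed out")
```

Returning False on timeout, rather than raising, leaves the decision about whether to keep waiting to the operator.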

Additional Information

Impact/Risks:
In most cases, the impact is limited to additional unreported space usage and the health alarm. This can be mitigated by following the workaround described in the Resolution section.

In some scenarios, the datastore may report being out of space even though actual space utilization is within acceptable limits. This happens because the broken components are wrongly included in the space calculations.