SSD congestion detected on a vSAN host

Article ID: 421597

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • SSD congestion observed on a vSAN host.
  • In the vobd.log file of the ESXi host, an SSD congestion threshold exceeded event is logged:

2025-11-10T08:16:50.499Z EventEx(esx.problem.vsan.lsom.congestionthreshold) ? LSOM SSDCong in cdsr 52107f71-####-####-####-3ed4411abe1f Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 204.'
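To check how often the threshold is exceeded, a copy of vobd.log can be scanned for this event. The following is a minimal sketch, not a supported tool: the regular expression is modeled only on the single sample line above, so other ESXi builds may format the event differently, and the names `CongestionEvent` and `find_congestion_events` are hypothetical.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class CongestionEvent(NamedTuple):
    timestamp: str
    device: str
    threshold: int
    current: int


# Assumption: the layout matches the sample event shown in this article.
CONGESTION_RE = re.compile(
    r"^(?P<ts>\S+)\s.*esx\.problem\.vsan\.lsom\.congestionthreshold.*?"
    r"SSDCong in cdsr (?P<dev>\S+) Congestion State: Exceeded\. "
    r"Congestion Threshold: (?P<thr>\d+) Current Congestion: (?P<cur>\d+)"
)


def find_congestion_events(lines: Iterable[str]) -> Iterator[CongestionEvent]:
    """Yield one CongestionEvent per matching log line; skip all other lines."""
    for line in lines:
        match = CONGESTION_RE.search(line)
        if match:
            yield CongestionEvent(
                timestamp=match["ts"],
                device=match["dev"],
                threshold=int(match["thr"]),
                current=int(match["cur"]),
            )
```

Typical use would be to open a local copy of the log (for example `with open("vobd.log") as f:`) and pass the file object to `find_congestion_events`. The function reads line by line, so it does not need to load the whole log into memory.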

 

 

Environment

VMware vSAN (All Versions)

Cause

This can occur when an SSD/NVMe device encounters an unrecoverable read error (metadata URE) in its metadata region. vSAN LSOM then marks the device unhealthy, which triggers an automatic evacuation and rebuild. Repeated metadataURE and diskunhealthy events indicate that the disk has underlying medium errors or unreadable sectors, causing it to fail out of the disk group.

 

  • In the vobd.log file of the ESXi host, events reporting that the disk encountered an unrecoverable read error (metadata URE) in the metadata region are logged frequently:

2025-11-10T07:47:10.262Z In(14) vobd[2098147]:  The event ([esx.problem.vob.vsan.lsom.metadataURE] Device 52107f71-####-####-####-3ed4411abe1f encountered an unrecoverable read error. It is in an unhealthy state and will get evacuated and rebuilt. If this device is part of a dedup diskgroup, the entire disk group will be evacuated and rebuilt.) was sent immediately to hostd;
2025-11-10T07:47:10.262Z In(14) vobd[2098147]:  [vSANCorrelator] 133782649211us: [vob.vsan.lsom.diskunhealthy] vSAN device 52107f71-####-####-####-3ed4411abe1f is unhealthy.
2025-11-10T07:47:10.262Z In(14) vobd[2098147]:  [vSANCorrelator] 133789736213us: [esx.problem.vob.vsan.lsom.diskunhealthy] vSAN device 52107f71-####-####-####-3ed4411abe1f is unhealthy.
2025-11-10T07:47:10.262Z In(14) vobd[2098147]:  The event ([esx.problem.vob.vsan.lsom.diskunhealthy] vSAN device 52107f71-####-####-####-3ed4411abe1f is unhealthy.) was sent immediately to host
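To see which device is affected and how often these events repeat, the log can be tallied per device. This is a rough sketch under stated assumptions: the pattern is derived only from the sample lines above, and it counts only the `The event ([...])` lines so that the `[vSANCorrelator]` copies of the same event are not counted again. The function name `count_lsom_events` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Assumption: the layout matches the sample vobd.log lines in this article.
LSOM_EVENT_RE = re.compile(
    r"The event \(\[esx\.problem\.vob\.vsan\.lsom\."
    r"(?P<kind>metadataURE|diskunhealthy)\]\s+"
    r"(?:vSAN device|Device)\s+(?P<dev>\S+)"
)


def count_lsom_events(lines: Iterable[str]) -> Counter:
    """Return a Counter keyed by (device UUID, event kind)."""
    counts: Counter = Counter()
    for line in lines:
        match = LSOM_EVENT_RE.search(line)
        if match:
            key: Tuple[str, str] = (match["dev"], match["kind"])
            counts[key] += 1
    return counts
```

A device that shows a high and growing count for both event kinds is the one to raise with the hardware vendor.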

Resolution

Engage the hardware vendor to assess the health of the SSD/NVMe device that is reporting the unrecoverable read error (URE).