vSAN NVMe SSD Congestion: Write buffer exhaustion when using deduplication

Article ID: 367247

Products

VMware vSAN

Issue/Introduction

  • Deduplication occurs during de-staging, which creates a software-limited bottleneck on high-performance NVMe devices.
    • Symptoms frequently occur during nightly backup windows, when high-volume I/O exceeds the rate at which data can be de-staged.

  • vobd.log may report heartbeat timeout messages, as in the following example:
2026-03-01T06:34:31.470Z: [vmfsCorrelator] 15162515160us: [vob.vmfs.heartbeat.timedout] XXXXXXXX-XXXXXXXX-XXXX-XXXXXXXXXXXX XXXXXXXX-XXXXXXXX-XXXX-XXXXXXXXXXXX
2026-03-01T06:34:31.471Z: [vmfsCorrelator] 15162510112us: [esx.problem.vmfs.heartbeat.timedout] XXXXXXXX-XXXXXXXX-XXXX-XXXXXXXXXXXX XXXXXXXX-XXXXXXXX-XXXX-XXXXXXXXXXXX
  • Cluster-wide latency, unresponsive VMs, or both during these periods.
    • The extent of the impact varies with the workload and its specific requirements.
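
To check whether the heartbeat timeouts cluster around backup windows, the vobd.log entries can be tallied per hour. The following is a minimal sketch, not a supported tool: the function name `timeouts_per_hour` is hypothetical, and the pattern assumes lines shaped like the example above. Every matching line is counted, so an occurrence logged under both event names counts twice.

```python
import re
from collections import Counter

# Assumes the line layout shown in the vobd.log example:
# "<ISO timestamp>Z: [source] <uptime>us: [<event name>] <ids>"
HEARTBEAT_RE = re.compile(
    r"^(?P<hour>\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}\.\d+Z: "
    r".*\[[\w.]*vmfs\.heartbeat\.timedout\]"
)


def timeouts_per_hour(lines):
    """Count heartbeat timeout lines, grouped by UTC date and hour."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.match(line)
        if match:
            counts[match.group("hour")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2026-03-01T06:34:31.470Z: [vmfsCorrelator] 15162515160us: "
        "[vob.vmfs.heartbeat.timedout] XXXXXXXX XXXXXXXX",
        "2026-03-01T06:34:31.471Z: [vmfsCorrelator] 15162510112us: "
        "[esx.problem.vmfs.heartbeat.timedout] XXXXXXXX XXXXXXXX",
    ]
    for hour, count in sorted(timeouts_per_hour(sample).items()):
        print(hour, count)
```

Hours with a high count that line up with backup job start times are consistent with the bottleneck described in this article, although they do not prove it on their own.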

Environment

VMware vSAN OSA

Cause

With deduplication enabled, vSAN can de-stage data only as fast as the software allows, because deduplication takes place during de-staging. On high-performance devices, I/O can reach the write buffer faster than data can be de-staged, creating a bottleneck that results in SSD congestion and cluster-wide latency.

Data resynchronization (following a disk failure) adds further write load and can exacerbate this bottleneck when deduplication is enabled.
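
The bottleneck can be pictured as a buffer that fills whenever incoming writes outpace de-staging. The sketch below is a deliberately simplified model, not how vSAN computes congestion: the function name and all rates and sizes are hypothetical, rates are assumed constant, and in practice congestion would be expected to begin before the buffer is completely full.

```python
def seconds_until_buffer_full(buffer_gb, ingest_mbps, destage_mbps, used_gb=0.0):
    """Estimate how long a write buffer lasts under sustained load.

    Returns the number of seconds until the buffer is full, or None when
    de-staging keeps up with ingest and the buffer never fills.
    Rates are in MB/s; sizes are in GB (1 GB = 1000 MB).
    """
    net_fill_mbps = ingest_mbps - destage_mbps
    if net_fill_mbps <= 0:
        return None
    free_mb = (buffer_gb - used_gb) * 1000
    return free_mb / net_fill_mbps


if __name__ == "__main__":
    # Illustrative numbers only: writes arrive at 2000 MB/s while
    # deduplication limits de-staging to 800 MB/s.
    for size_gb in (600, 1600):
        print(size_gb, "GB:", seconds_until_buffer_full(size_gb, 2000, 800), "s")
```

Under these assumed numbers a larger buffer only delays exhaustion. The buffer stays healthy in the long run only if the ingest rate falls or the de-stage rate rises.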

Resolution

  1. Consider turning off deduplication and using compression only for space savings.
  2. Upgrade to vSAN 8.x, which allows a larger cache tier write buffer (up to 1.6 TB) and can mitigate this issue.
  3. Stagger backup job schedules to flatten I/O peaks.
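
The effect of staggering backup jobs can be estimated by summing the expected write rate of each job per hour and comparing the peaks. This is a rough sketch with hypothetical job definitions and rates. It ignores non-backup I/O and assumes each job writes at a constant rate.

```python
def peak_write_rate(jobs, horizon_hours=24):
    """Return the highest combined write rate across all hourly slots.

    jobs is a list of (start_hour, duration_hours, rate_mbps) tuples.
    Jobs that run past midnight wrap around to the start of the day.
    """
    load = [0] * horizon_hours
    for start_hour, duration_hours, rate_mbps in jobs:
        for hour in range(start_hour, start_hour + duration_hours):
            load[hour % horizon_hours] += rate_mbps
    return max(load)


if __name__ == "__main__":
    # Three hypothetical two-hour jobs at 800 MB/s each.
    overlapping = [(22, 2, 800), (22, 2, 800), (22, 2, 800)]
    staggered = [(22, 2, 800), (0, 2, 800), (2, 2, 800)]
    print("overlapping peak:", peak_write_rate(overlapping), "MB/s")
    print("staggered peak:", peak_write_rate(staggered), "MB/s")
```

If the staggered peak stays below the rate at which the cluster can de-stage, the write buffer has time to drain between jobs.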

Additional Information

VMware vSAN Support for large cache tier write buffer in vSAN 8.0 and higher