Delete congestion is triggered when a vSAN disk group is more than 70% full and its pending deletes exceed 10% of the write buffer size (600 GB), that is, more than 60 GB. Pending deletes can accumulate through normal background vSAN rebalancing (increasingly likely as the disk group fills), through deletion of VMs or vDisks, or through Storage vMotion (svMotion) out of the vSAN cluster.
To avoid running out of space during large delete operations, vSAN may temporarily throttle incoming I/O. Virtual machines can experience this as a temporary drop in performance (a latency spike).
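The trigger condition described above can be restated as a small check. This is only an illustrative sketch of the two thresholds quoted in this article, not vSAN's internal logic; the function and constant names are hypothetical, and the 600 GB write buffer is assumed as the default.

```python
# Illustrative restatement of the trigger condition from this article.
# All names here are hypothetical; this is not vSAN's implementation.
WRITE_BUFFER_GB = 600            # write buffer size quoted in the article
FULLNESS_THRESHOLD = 0.70        # disk group must be more than 70% full
PENDING_DELETE_FRACTION = 0.10   # pending deletes must exceed 10% of the buffer


def pending_delete_limit_gb(write_buffer_gb: float = WRITE_BUFFER_GB) -> float:
    """Return the pending-delete volume (GB) above which the second condition holds."""
    return write_buffer_gb * PENDING_DELETE_FRACTION


def delete_congestion_expected(used_fraction: float, pending_deletes_gb: float) -> bool:
    """Return True only when both thresholds from the article are exceeded."""
    return (used_fraction > FULLNESS_THRESHOLD
            and pending_deletes_gb > pending_delete_limit_gb())
```

For example, a disk group that is 75% full with 61 GB of pending deletes meets both conditions, while the same disk group with 59 GB of pending deletes does not.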
vSAN now provides adaptive delete congestion for compression-only disk groups in vSAN OSA, improving IOPS performance and delivering more predictable application responses.
The following steps describe how to monitor delete congestion on ESXi 7.0.x.
"Delete congestion" entries are written to the "vsantracesUrgent" logs under /var/log/vsantraces and identify the hosts that are experiencing congestion.
Look for "DOMTraceFlowctlShowCongestionForRoleAndOpAt" entries in the vsantracesUrgent log.
Use the vsanTraceReader command to read a vSAN trace file. Run the following command to read and filter the trace file:
/usr/lib/vmware/vsan/bin/vsanTraceReader vsantracesUrgent--#####.gz | grep DOMTraceFlowctlShowCongestionForRoleAndOpAt
Example output:
vsantracesUrgent--2024-11-01T04h03m31s703.txt:2024-11-01T06:44:21.375430 [42245468] [cpu59] [6eadbb91 OWNER commit Transaction] DOMTraceFlowctlShowCongestionForRoleAndOpAt:13669: {'op': 0x45db5d687a00, 'obj': 0x45db89b594c0, 'obj Uuid': '########-####-####-####-############, 'lower layer single congestion': 0, 'lower layer shared congestion': 120, 'scheduler shared congestion': 0, 'role congestion': 0, 'op congestion': 120, 'at': 'At Owner root'}
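When the filtered output is long, it can help to pull out only the timestamp and the congestion counters. The following is a minimal Python sketch that assumes every line follows the layout of the sample entry above. The helper names (parse_congestion_line, congested_lines) and the idea of filtering on a minimum value are assumptions made for illustration and are not part of any vSAN tooling.

```python
import re

MARKER = "DOMTraceFlowctlShowCongestionForRoleAndOpAt"
# Matches pairs such as 'op congestion': 120 in the trace payload.
CONGESTION_FIELD = re.compile(r"'([^']*congestion)':\s*(\d+)")
# Matches the entry timestamp, e.g. 2024-11-01T06:44:21.375430.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+")

# Sample entry copied from this article (UUID redacted as in the original).
SAMPLE = (
    "vsantracesUrgent--2024-11-01T04h03m31s703.txt:2024-11-01T06:44:21.375430 "
    "[42245468] [cpu59] [6eadbb91 OWNER commit Transaction] "
    "DOMTraceFlowctlShowCongestionForRoleAndOpAt:13669: "
    "{'op': 0x45db5d687a00, 'obj': 0x45db89b594c0, "
    "'obj Uuid': '########-####-####-####-############, "
    "'lower layer single congestion': 0, 'lower layer shared congestion': 120, "
    "'scheduler shared congestion': 0, 'role congestion': 0, "
    "'op congestion': 120, 'at': 'At Owner root'}"
)


def parse_congestion_line(line):
    """Return the timestamp and congestion counters of one trace line, or None."""
    if MARKER not in line:
        return None
    stamp = TIMESTAMP.search(line)
    counters = {name: int(value) for name, value in CONGESTION_FIELD.findall(line)}
    return {"timestamp": stamp.group(0) if stamp else None, "congestion": counters}


def congested_lines(lines, minimum=1):
    """Yield parsed entries whose largest congestion counter is at least `minimum`."""
    for line in lines:
        parsed = parse_congestion_line(line)
        if parsed and max(parsed["congestion"].values(), default=0) >= minimum:
            yield parsed
```

To use it, the output of the vsanTraceReader command can be saved to a text file and its lines passed to congested_lines; each result carries the entry timestamp and a dictionary of the counters, for example 'op congestion' and 'lower layer shared congestion'.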
Starting with version 8.0, delete congestion can be monitored in /var/run/log/vmkernel or /var/run/log/vobd.log.