VMs on vSAN datastore are experiencing performance issue with Memory/SSD/Component Congestion

Article ID: 414066


Products

VMware vSAN

Issue/Introduction

VMs on a vSAN datastore experience performance issues, and the ESXi host reports Memory/SSD/Component congestion in vobd.log and in the LSOM disk statistics:

# vobd.log
YYYY-MM-DDThh:mm:ss.###Z cpu21:2099098)LSOM: LSOMThrowAsyncCongestionVOB:442: LSOM MemCong in ########-####-####-####-############ Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 255.
YYYY-MM-DDThh:mm:ss.###Z cpu21:2099098)LSOM: LSOMThrowAsyncCongestionVOB:442: LSOM SSDCong in ########-####-####-####-############ Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 255.
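
When vobd.log holds many of these events, it can help to reduce them to the highest congestion value seen per disk and congestion type. The following is a minimal sketch, assuming the log lines follow exactly the format shown above; `summarize_congestion_vobs` is a hypothetical helper name, not a vSAN tool.

```shell
# Hypothetical helper: reads vobd.log lines on stdin and prints
# "<disk-uuid> <congestion-type> <max-value>" for each disk and type.
# Assumes the line layout shown in the samples above.
summarize_congestion_vobs() {
  awk '
    /LSOMThrowAsyncCongestionVOB/ {
      for (i = 1; i < NF; i++) {
        # Match the "<Type>Cong in <uuid>" part of the message.
        if ($i ~ /Cong$/ && $(i + 1) == "in") {
          key = $(i + 2) " " $i
          value = $NF + 0   # last field is "<value>."; +0 drops the dot
          if (!(key in max) || value > max[key]) max[key] = value
        }
      }
    }
    END { for (key in max) print key, max[key] }' | sort
}
```

The log content would be fed in on standard input, for example by redirecting the host's vobd.log file into the function.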
# vsish -e cat /vmkModules/lsom/disks/###/info

:::
   memCongestion:243     ###<- !!!
   slabCongestion:0
   ssdCongestion:0
   iopsCongestion:0
   logCongestion:0
   compCongestion:9      ###<- !!!
   mdCongestion:0
:::
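
To pick out only the counters that matter from this output, the non-zero congestion fields can be filtered. This is a sketch under the assumption that the fields appear as `name:value` lines as shown above; `nonzero_congestion` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the disk info output on stdin and prints
# every *Congestion counter whose value is greater than zero.
nonzero_congestion() {
  awk -F: '
    /Congestion:/ {
      name = $1
      gsub(/[ \t]/, "", name)   # strip the leading indentation
      value = $2 + 0            # numeric coercion ignores trailing notes
      if (value > 0) print name "=" value
    }'
}
```

Example use on a host: `vsish -e cat /vmkModules/lsom/disks/###/info | nonzero_congestion`, with `###` replaced by the disk identifier as in the command above.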

The affected disks also show a high number of elements in their commit tables (more than 100k in the example below):

# vsish -e ls /vmkModules/lsom/disks/ 2>/dev/null | while read d ; do echo -n ${d/\//} ; vsish -e get /vmkModules/lsom/disks/${d}WBQStats | grep "Number of elements in commit tables" ; done | grep -v ":0$"

########-####-####-####-############   Number of elements in commit tables:2161126 (>100k)
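
The loop above lists every disk with a non-zero count. To keep only disks above the 100k mark noted in the sample output, a numeric filter can be applied. This is a sketch: the default limit of 100000 is taken from the "(>100k)" annotation and is an assumption rather than a documented threshold, and `over_commit_limit` is a hypothetical helper name.

```shell
# Hypothetical helper: reads "<uuid>   Number of elements in commit
# tables:<count>" lines on stdin and prints "<uuid> <count>" for every
# disk whose count exceeds the limit (default 100000, an assumption).
over_commit_limit() {
  limit="${1:-100000}"
  awk -F: -v limit="$limit" '
    /Number of elements in commit tables/ {
      count = $NF + 0          # text after the last colon, as a number
      if (count > limit) {
        split($1, parts, " ")  # first word before the colon is the UUID
        print parts[1], count
      }
    }'
}
```

The output of the loop above can be piped into `over_commit_limit`, optionally with a different limit as the first argument.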

Environment

VMware vSAN 7.X

Cause

This issue occurs in clusters that contain large objects with a concatenated component layout, and it is more likely to occur when these large objects are I/O-intensive.

Resolution

This issue is resolved in:

  • ESXi 7.0 Update 3 P06 (build: 20842708)
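
To check whether a host already runs the fixed build, its build number can be compared with the one listed above. The sketch below assumes the version string has the form printed by `vmware -v` (for example `VMware ESXi 7.0.3 build-20842708`). It only evaluates ESXi 7.0 hosts, because build numbers are not comparable across major releases. `build_status` is a hypothetical helper name.

```shell
# Fixed build number taken from the Resolution section above.
FIXED_BUILD=20842708

# Hypothetical helper: takes a version string such as
# "VMware ESXi 7.0.3 build-20842708" and prints whether that 7.0 host
# is at or above the fixed build. Other strings yield "unknown".
build_status() {
  case "$1" in
    *"ESXi 7.0"*build-[0-9]*)
      build="${1##*build-}"    # keep only the digits after "build-"
      if [ "$build" -ge "$FIXED_BUILD" ]; then
        echo "at-or-above-fixed-build"
      else
        echo "below-fixed-build"
      fi
      ;;
    *)
      echo "unknown"
      ;;
  esac
}
```

Example use on a host: `build_status "$(vmware -v)"`.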

Workaround:

The commit-table entries of an impacted Disk-Group (and with them the Memory congestion) can be cleared by unmounting and then mounting that Disk-Group.
If the Disk-Group cannot be unmounted and mounted through the vSphere UI or the CLI, reboot the node that holds the congested Disk-Group(s). The Disk-Groups are unmounted and mounted automatically as part of the node restart.
Depending on the severity of the issue, entering Maintenance Mode with the Ensure Accessibility option may not be possible.