The "Number of elements in commit tables" value is greater than 100K and does not decrease over a period of X hours (see Script 2 below)
AND/OR
One or more of the following applies:
DOM: DOM2PCPrintDescriptor:1797: [105568173:0x4313fe8f3718] => Stuck descriptor
LSOM: LSOM_ThrowCongestionVOB:3429: Throttled: Virtual SAN node "HOSTNAME" maximum Memory congestion reached.
LSOM_ThrowAsyncCongestionVOB:1669: LSOM Memory Congestion State: Exceeded. Congestion Threshold: 200 Current Congestion: 204.
Script 1: Verify existing LSOM memory congestion (run on each vSAN host):
while true; do
  echo "================================================"
  date
  for ssd in $(localcli vsan storage list | grep "Group UUID" | awk '{print $5}' | sort -u); do
    echo $ssd
    vsish -e get /vmkModules/lsom/disks/$ssd/info | grep Congestion
  done
  for ssd in $(localcli vsan storage list | grep "Group UUID" | awk '{print $5}' | sort -u); do
    llogTotal=$(vsish -e get /vmkModules/lsom/disks/$ssd/info | grep "Log space consumed by LLOG" | awk -F \: '{print $2}')
    plogTotal=$(vsish -e get /vmkModules/lsom/disks/$ssd/info | grep "Log space consumed by PLOG" | awk -F \: '{print $2}')
    llogGib=$(echo $llogTotal | awk '{print $1 / 1073741824}')
    plogGib=$(echo $plogTotal | awk '{print $1 / 1073741824}')
    allGibTotal=$(expr $llogTotal \+ $plogTotal | awk '{print $1 / 1073741824}')
    echo $ssd
    echo " LLOG consumption: $llogGib"
    echo " PLOG consumption: $plogGib"
    echo " Total log consumption: $allGibTotal"
  done
  sleep 30
done
Sample Output
529dd4dc-####-####-####-###############
memCongestion:### >> This value will be higher than 0
slabCongestion:0
ssdCongestion:0
iopsCongestion:0
logCongestion:0
compCongestion:0
memCongestionLocalMax:0
slabCongestionLocalMax:0
ssdCongestionLocalMax:0
iopsCongestionLocalMax:0
logCongestionLocalMax:0
compCongestionLocalMax:0
529dd4dc-####-####-####-###############
 LLOG consumption: 0.270882
 PLOG consumption: 0.632553
 Total log consumption: 0.903435
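On hosts with many disk groups, the Script 1 output can be long. The following is a minimal sketch of a filter that keeps only the disk groups reporting a non-zero memCongestion value. The function name show_mem_congested is hypothetical and not part of any VMware tooling, and the sketch assumes that each congestion counter is printed on its own line beneath the disk group UUID.

```shell
# Hypothetical helper (not a VMware tool): reads Script 1 output on stdin and
# prints each disk group UUID whose memCongestion value is greater than 0.
# Assumption: each counter appears on its own line below the UUID line.
show_mem_congested() {
  awk '
    # A line made up only of UUID characters starts a new disk group.
    /^[0-9a-fA-F#-]+$/ { uuid = $0; next }
    # Match "memCongestion:<n>" but not "memCongestionLocalMax:<n>".
    /^[ \t]*memCongestion:/ {
      split($0, kv, ":")
      value = kv[2] + 0          # numeric prefix only; trailing notes are ignored
      if (value > 0) print uuid, "memCongestion=" value
    }
  '
}
```

To use it, save one iteration of the Script 1 output to a file and run: show_mem_congested < saved_output.txt (the file name is a placeholder).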
Script 2: Verify current values of "Number of elements in commit tables":
vsish -e ls /vmkModules/lsom/disks/ 2>/dev/null | while read d ; do echo -n ${d/\//} ; vsish -e get /vmkModules/lsom/disks/${d}WBQStats | grep "Number of elements in commit tables" ; done | grep -v ":0$"
Sample output for two disk groups on a host (verify that the lines returned match all cache disks; ignore any capacity disks):
529395f3-####-####-####-###############/ Number of elements in commit tables:300891 >> Disk Group affected ( = Value > 100K )
526709f4-####-####-####-###############/ Number of elements in commit tables:289371 >> Disk Group affected ( = Value > 100K )
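The 100K check can also be applied mechanically to the Script 2 output. The following is a minimal sketch, not a VMware-provided tool: the function name flag_commit_tables and its optional threshold argument are hypothetical, and it assumes the line format shown in the sample output above (UUID, then the counter name, then a colon and the value).

```shell
# Hypothetical helper (not a VMware tool): reads Script 2 output on stdin and
# prints the UUID and count of every disk group above the threshold.
# The threshold defaults to 100000 (the 100K value used in this article).
flag_commit_tables() {
  awk -F: -v limit="${1:-100000}" '
    /Number of elements in commit tables/ {
      count = $2 + 0                       # value after the colon
      # The leading run of UUID characters identifies the disk group.
      if (match($0, /^[0-9a-fA-F#-]+/) && count > limit)
        print substr($0, RSTART, RLENGTH), count
    }
  '
}
```

To use it, pipe the Script 2 command into flag_commit_tables. Running this at intervals and comparing the printed counts shows whether the value is decreasing over time.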
# esxcfg-advcfg -s 0 /VSAN/ObjectScrubPersistMin
If assistance is required, open a support ticket with VMware by Broadcom Support.