This KB addresses a rare condition in which vSAN IO operations build up on the cache devices without being processed, which eventually leads to LogCongestion and performance degradation.
Symptoms:
The cluster is running a vSAN OSA version prior to 7.0U3l or 8.0U1 with UNMAP enabled, and one or more of the following symptoms or conditions are present:
- Significant performance degradation on the vSAN cluster
- One or more disk group(s) experiencing log congestion
To verify the current vSAN UNMAP settings, run the following commands on the ESXi host:
vSAN GuestUnmap:
# esxcfg-advcfg -g /VSAN/GuestUnmap
Value of GuestUnmap is 1
A value of ‘1’ means GuestUnmap is enabled, while ‘0’ means it is disabled.
vSAN unmapFairness:
# esxcfg-advcfg -g /LSOM/unmapFairness
Value of unmapFairness is 1
A value of ‘1’ means unmapFairness is enabled, which is the default configuration after vSAN 7.0U1.
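The two checks above can be combined into one small script when several hosts need to be reviewed. The following is only a minimal sketch: the function names (parse_advcfg_value, describe_state, report_option) are hypothetical helpers, not part of any VMware tooling, and the parsing assumes the command output has the form "Value of <Name> is <N>" as in the sample output shown in this KB.

```shell
#!/bin/sh
# Sketch: report the two UNMAP-related advanced settings on an ESXi host.
# Assumption: "esxcfg-advcfg -g <option>" prints "Value of <Name> is <N>",
# matching the sample output in this KB. Helper names are hypothetical.

# Extract the trailing integer from a line such as "Value of GuestUnmap is 1".
# Prints nothing if the line does not match the assumed format.
parse_advcfg_value() {
    printf '%s\n' "$1" | sed -n 's/^Value of .* is \([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Translate the numeric value into a readable state.
describe_state() {
    case "$1" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

# Query one advanced option and print "<option>: <state>".
report_option() {
    out=$(esxcfg-advcfg -g "$1" 2>/dev/null)
    val=$(parse_advcfg_value "$out")
    echo "$1: $(describe_state "$val")"
}

# Only query the host when the ESXi tool is actually available.
if command -v esxcfg-advcfg >/dev/null 2>&1; then
    report_option /VSAN/GuestUnmap
    report_option /LSOM/unmapFairness
fi
```

The script only reads the settings and does not change them. A state of "unknown" means the output did not match the assumed format and should be checked by running the command manually.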
If you suspect your environment has encountered this condition, open a service request referencing this KB article so VMware vSAN GS can provide additional triage using an internal script that measures different levels of consumption at the cache device layer.