/etc/init.d/clomd status", while no abnormal status can be found in clomd.logvsansystem.log, there is error such as:YYYY-MM-DDTHH:MM:SS verbose vsansystem[#########] [vSAN@6876 sub=PyBackedMO opId=<opId>] Invoke vim.host.VsanHealthSystem.checkClomdLiveness failed: (vim.fault.VsanFault) {--> faultMessage = (vmodl.LocalizableMessage) [--> (vmodl.LocalizableMessage) {--> key = 'com.vmware.vsan.health.msg.clomd.connecterror',--> message = "Cannot connect to clomd process and possibly it's downvsanmgmt.log, there is "MEMORY PRESSURE", such as:YYYY-MM-DDTHH:MM:SS info vsand[#######] [opID=MainThread statsdaemon::_logDaemonMemoryStats] Daemon memory stats: eMin=141.860MB, eMinPeak=174.080MB, rMinPeak=175.204MB MEMORY PRESSUREYYYY-MM-DDTHH:MM:SS info vsand[#######] [opID=MainThread statsdaemon::_logDaemonMemoryStats] Daemon memory stats: eMin=141.860MB, eMinPeak=174.080MB, rMinPeak=175.204MB MEMORY PRESSUREYYYY-MM-DDTHH:MM:SS info vsand[#######] [opID=MainThread statsdaemon::_logDaemonMemoryStats] Daemon memory stats: eMin=141.860MB, eMinPeak=174.080MB, rMinPeak=175.204MB MEMORY PRESSUREvSAN 7.x
This issue is caused by high memory usage of the vsanmgmtd service.
Workaround:
Either of the workarounds below can be used to remediate this issue:
1. When the clomd liveness issue occurs again, log in to the ESXi host command line and run "/etc/init.d/vsanmgmtd restart" to restart the vsanmgmtd service.
2. Run the command below to increase the memory upper limit of the vsanperfsvc memory pool, which vsanmgmtd runs under, to 400 MB:
#localcli --plugin=/usr/lib/vmware/esxcli/int/lib64/libsched-internal.so sched group setmemconfig -g host/vim/vmvisor/vsanperfsvc -i 100 -m 400 -u mb
The command below can be used to check the memory upper limit of the vsanperfsvc memory pool:
#localcli --plugin=/usr/lib/vmware/esxcli/int/lib64/libsched-internal.so sched group getmemconfig -g host/vim/vmvisor/vsanperfsvc
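As a sketch of the full sequence for the second workaround, the current limit can be checked first, then raised, then checked again to confirm the new 400 MB maximum (these are the same localcli calls shown above, run back to back):

# localcli --plugin=/usr/lib/vmware/esxcli/int/lib64/libsched-internal.so sched group getmemconfig -g host/vim/vmvisor/vsanperfsvc
# localcli --plugin=/usr/lib/vmware/esxcli/int/lib64/libsched-internal.so sched group setmemconfig -g host/vim/vmvisor/vsanperfsvc -i 100 -m 400 -u mb
# localcli --plugin=/usr/lib/vmware/esxcli/int/lib64/libsched-internal.so sched group getmemconfig -g host/vim/vmvisor/vsanperfsvc

It may also be worth re-checking the limit after a host reboot or upgrade, since scheduler-group settings applied with localcli are not guaranteed to persist.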