vSAN 8.x reports vSAN daemon liveness in red in vCenter UI and cmmdsTimeMachine is not running when checked from host CLI.

Article ID: 389004

Updated On:

Products

VMware vSAN

Issue/Introduction

 

The vCenter UI reports the vSAN daemon liveness health check in red.

 

Restarting cmmdsTimeMachine from the host CLI appears to succeed:
/etc/init.d/cmmdsTimeMachine restart
There are 1 /etc/init.d/cmmdsTimeMachine running ...
cmmdsTimeMachine stopped.
cmmdsTimeMachine is starting on hostUuid xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
cmmdsTimeMachine started

Yet the status check reports that cmmdsTimeMachine is not running and is still starting:
/etc/init.d/cmmdsd status && /etc/init.d/epd status && /etc/init.d/clomd status && /etc/init.d/cmmdsTimeMachine status && /etc/init.d/osfsd status
cmmdsd is running
epd is running
clomd is running
cmmdsTimeMachine is not running
cmmdsTimeMachine is starting on hostUuid xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
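
The status check above can be reduced to a small filter that prints only the daemons reporting "is not running". This is a minimal sketch, assuming the status lines follow the format shown above; the function name not_running_daemons is hypothetical and is not part of ESXi.

```sh
#!/bin/sh
# Hypothetical helper: reads init-script status output on stdin and prints
# the first word (the daemon name) of every line ending in "is not running".
not_running_daemons() {
    awk '/ is not running$/ { print $1 }'
}

# Possible usage on a host, assuming the init scripts named in this article.
# A loop is used instead of "&&" so that one failing status call does not
# prevent the remaining daemons from being checked.
# for d in cmmdsd epd clomd cmmdsTimeMachine osfsd; do
#     /etc/init.d/"$d" status
# done | not_running_daemons
```

Fed the status output shown above, the filter would print only cmmdsTimeMachine.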

 

cmmdsTimeMachine.log:
2025-02-17T15:44:25.781Z In(30) cmmdsTimeMachine[7593520]: cmmdsTimeMachine is starting on hostUuid xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
2025-02-17T15:45:20.688Z In(30) cmmdsTimeMachine[7593613]: cmmdsTimeMachine is starting on hostUuid xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

 

VMkernel.log:
2025-02-17T15:44:19.291Z In(182) vmkernel: cpu32:7593497)started from 'init' 2097676 with cmdline '/bin/init', parent 0
2025-02-17T15:44:19.291Z In(182) vmkernel: cpu32:7593497)uw.7593497 (38988727) requires 1244 KB, asked 1244 KB from cmmdsTimeMachine (3665) which has 39964 KB occupied and 996 KB available.
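
The second vmkernel line above appears to show a memory request (1244 KB) that exceeds what is left in the cmmdsTimeMachine memory pool (996 KB). The sketch below pulls those two numbers out of such lines and prints a message when the request is larger than the available amount. It assumes the log line format shown above; the function name mem_shortfall is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: reads vmkernel log lines on stdin. For lines of the form
#   "... requires N KB, asked N KB from <pool> (...) ... and M KB available."
# it prints a message when the requested size N is larger than M.
mem_shortfall() {
    sed -n 's/.*requires \([0-9][0-9]*\) KB, asked [0-9][0-9]* KB from \([^ ]*\) .* and \([0-9][0-9]*\) KB available.*/\2 \1 \3/p' |
    awk '($2 + 0) > ($3 + 0) {
        printf "%s: requested %d KB, only %d KB available\n", $1, $2, $3
    }'
}

# Possible usage on a host (log path assumed):
# mem_shortfall < /var/run/log/vmkernel.log
```

For the log line quoted above, this prints "cmmdsTimeMachine: requested 1244 KB, only 996 KB available".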

 

/var/run/log/vsanmgmt.log:
2025-02-17T15:56:11.058Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanVcObjectHelper::wrapper] Ready to get single executor result for the key ['_CheckClomdLiveness', 'ha-vsan-health-system'] in timeout 15
2025-02-17T15:56:11.059Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanVcObjectHelper::wrapper] Ready to call original function for execution
2025-02-17T15:56:11.060Z In(14) vsand[2100268]: [opID=4ebc50f8 VsanHealthSystemImpl::_CheckClomdLiveness] CLOM stats query is done
2025-02-17T15:56:11.060Z In(14) vsand[2100268]: [opID=4ebc50f8 VsanHealthSystemImpl::_CheckClomdLiveness] Get clomd up time 2154728342 with JSON size: 1014 bytes
2025-02-17T15:56:11.060Z In(14) vsand[2100268]: [opID=4ebc50f8 VsanObjectHelper::updateClomToolResult] Update minSpaceRequiredForVsanOp from 1143535042560 to 1036160860160
2025-02-17T15:56:11.061Z In(14) vsand[2100268]: [opID=4ebc50f8 VsanVcObjectHelper::AddCallResultAndNotify] delete the single concurrent execution key
2025-02-17T15:56:11.061Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanVcObjectHelper::wrapper] Finish execute _CheckClomdLiveness
2025-02-17T15:56:11.061Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanHealthSystemImpl::CheckHostDaemonHealth] Reload daemon pid cache for cmmdsTimeMachined and retry
2025-02-17T15:56:11.062Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanVcObjectHelper::wrapper] Ready to get single executor result for the key ['_GetDaemonPid', 'ha-vsan-health-system', 'cmmdsTimeMachined', "/bin/pgrep -fl 'cmmdsTimeMachine' | awk '($2 ~ /python$/) { print $1 }' | head -n 1"] in timeout 10
2025-02-17T15:56:11.062Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanVcObjectHelper::wrapper] Ready to call original function for execution
2025-02-17T15:56:11.076Z Er(11) vsand[2100495]: [opID=4ebc50f8 VsanHealthSystemImpl::_GetDaemonPid] Error happened when get daemon pid for cmmdsTimeMachined, stdout: , err: , ret: 0
2025-02-17T15:56:11.076Z In(14) vsand[2100495]: [opID=4ebc50f8 VsanVcObjectHelper::AddCallResultAndNotify] delete the single concurrent execution key
2025-02-17T15:56:11.076Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanVcObjectHelper::wrapper] Finish execute _GetDaemonPid
2025-02-17T15:56:11.076Z Er(11) vsand[2100403]: [opID=4ebc50f8 VsanHealthSystemImpl::CheckHostDaemonHealth] Daemon cmmdsTimeMachined is not running
2025-02-17T15:56:11.077Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanPyVmomiProfiler::log] Profiler:
2025-02-17T15:56:11.077Z In(14) vsand[2100403]: [opID=4ebc50f8 VsanPyVmomiProfiler::logProfile]   clomTool.getClomdStatsCommand: 0.00s, consumed: 118744KB (+0KB), consumedPeak: 121640KB (+0KB), effectiveMin: 174080KB (+0KB), effectiveMinPeak: 174080KB (+0KB), implicitMin: 143716KB (+0KB), requestedMinPeak: 154004KB (+0KB)
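
The _GetDaemonPid entry above shows the pipeline the health check uses to find the daemon: it keeps only pgrep matches whose second field ends in "python" and takes the first PID. An empty stdout with ret 0, as logged, means no such process matched. The sketch below isolates that filter so it can be tried against captured pgrep output. The function name first_python_pid is hypothetical, and the sample lines in the comments are made up for illustration, not taken from a real host.

```sh
#!/bin/sh
# Filter copied from the vsand log entry above: keep lines whose second field
# ends in "python", print the first field (the PID), and stop after one match.
first_python_pid() {
    awk '($2 ~ /python$/) { print $1 }' | head -n 1
}

# Possible usage on a host, mirroring the logged command:
# /bin/pgrep -fl 'cmmdsTimeMachine' | first_python_pid
#
# Illustration with made-up input: only the second line passes the filter.
# printf '1001 /bin/sh /etc/init.d/cmmdsTimeMachine status\n1002 /bin/python cmmdsTimeMachine\n' | first_python_pid
```

If the command prints nothing on an affected host, that is consistent with the "Daemon cmmdsTimeMachined is not running" error in the log.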

Environment

VMware vSAN 8.x

Cause

On rare occasions, cmmdsTimeMachine runs out of memory when a large number of DOM objects are being updated.

Resolution

Changes were made to detect the cmmdsTimeMachine MemoryError.
To resolve the issue, upgrade vCenter Server to 8.0 Update 3b or later and the ESXi hosts to ESXi 8.0.3 P04 or later.