vSAN Health Check reporting multiple failed tests that rely on vsanmgmtd

Article ID: 317829

Products

VMware vSAN

Issue/Introduction

Symptoms:
In the vSphere UI managing a vSAN cluster, you observe one or more of the following symptoms:
  • One or more of the following vSAN Health checks fail for one or more hosts:
    • ESXi vSAN Health service installation
    • Advanced vSAN configuration in sync
    • Hosts with connectivity issues
    • All hosts contributing stats
    • Performance data collection
    • Stats master election
    • Physical disk health retrieval issues
  • In vsanmgmt.log messages similar to the following are observed:
2018-11-26T03:20:58Z VSANMGMTSVC: INFO vsanperfsvc[MainThread] [statsdaemon::_logDaemonMemoryStats] Daemon memory stats: eMin=212.216MB, eMinPeak=215.040MB, rMinPeak=217.128MB  MEMORY PRESSURE
  • In vmkernel.log messages similar to the following are observed:
2018-11-26T03:20:03.759Z cpu0:333127680)WARNING: CMMDS: CMMDSArenaMemMapToUser:153: Failed to map MPNs to world 333127680: Out of memory
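
To check whether a host is logging these messages, you can search the logs from the ESXi CLI. A minimal example is shown below; the log paths are the ESXi defaults and may differ if logging has been redirected on the host:

$ grep -i "MEMORY PRESSURE" /var/log/vsanmgmt.log
$ grep -i "Out of memory" /var/log/vmkernel.log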


Environment

VMware vSAN 6.x

Cause

The failed vSAN Health checks are generally false positives caused by vsanmgmtd (the vSAN management daemon that the Health and Performance services rely on) being in a non-functional or impaired state. This can occur when vsanmgmtd runs out of its allocated memory resources.

Resolution

This is a known issue fixed in ESXi 6.5 P03.

Build and Download details can be found in the release notes here:
VMware ESXi 6.5, Patch Release ESXi650-201810002

This issue is also fixed in ESXi 6.7 U1.

Upgrade both vCenter and all hosts in the cluster to build 10884925 or higher. It is highly recommended that you upgrade to the latest available 6.5 U3 or 6.7 U3 patch if possible.
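
To confirm the build a host is currently running, you can check the version from the ESXi CLI, for example:

$ vmware -vl
$ esxcli system version get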

Workaround:
As a temporary solution, restarting vsanmgmtd and the management agents may resolve the issue.

Log in to the ESXi CLI via the console or SSH as root and run the following commands on all hosts that are currently impacted:

$ /etc/init.d/vsanmgmtd stop
$ /etc/init.d/vsanmgmtd start

$ /etc/init.d/hostd stop
$ /etc/init.d/hostd start

$ /etc/init.d/vpxa stop
$ /etc/init.d/vpxa start
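
Once the services have restarted, you can confirm that vsanmgmtd is running again before retesting the vSAN Health checks. As a sketch, assuming the init script supports the standard status sub-command:

$ /etc/init.d/vsanmgmtd status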


Additional Information

Impact/Risks:
  • Any vSAN Health information that requires vsanmgmtd will not be checked on the impacted hosts.
  • Display and persistence of vSAN performance stats from the impacted hosts will fail if remote collection to the stats master is non-functional.
  • Display and persistence of vSAN performance stats for all hosts in the cluster will not be possible if there is no current stats master.
  • Issues with vsanmgmtd can prevent hosts from entering Maintenance Mode.