Skyline Health is not displayed for any vSAN cluster in vCenter. On the vCenter Server, the vsan-health service cannot be started even though all other services are running. /var/log/vmware/vmon/vmon.log shows entries like the following:
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Constructed command: /usr/sbin/vsanvcmgmtd -c /usr/lib/vmware-vpx/vsan-health/VsanVcMgmtConfig.xml -u /etc/vmware-vsan-health/VsanMgmtCustomizedConfig.xml
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Running the API Health command as user root
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vpx/vsan-health/vsanhealth-vmon-apihealth.py
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Re-check service health since it is still initializing.
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Running the API Health command as user root
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vpx/vsan-health/vsanhealth-vmon-apihealth.py
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Re-check service health since it is still initializing.
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Running the API Health command as user root
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vpx/vsan-health/vsanhealth-vmon-apihealth.py
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Re-check service health since it is still initializing.
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Running the API Health command as user root
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vpx/vsan-health/vsanhealth-vmon-apihealth.py
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Re-check service health since it is still initializing.
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Running the API Health command as user root
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vpx/vsan-health/vsanhealth-vmon-apihealth.py
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Re-check service health since it is still initializing.
......
......
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vpx/vsan-health/vsanhealth-vmon-apihealth.py
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Re-check service health since it is still initializing.
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| I005: <vsan-health> Service start operation timed out.
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| W003: <vsan-health> Found empty StopSignal parameter in config file. Defaulting to SIGTERM
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| W003: <vsan-health> Service exited. Exit code 1
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| E002: Service batch op START failed. Failed services: 'vsan-health'
YYYY-MM-DDTHH:mm:ss.xxx| host-xxxx| E002: Services vsan-health failed to start on vMon startup. ErrCode 1
The vsan-health log file /var/log/vmware/vsan-health/vmware-vsan-health-service.log contains no entries for the vsan-health start attempt, and the timestamp of its last entry is well in the past.
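To confirm that a vMon log matches the pattern above, the repeated re-checks and the final timeout can be counted. The following is a minimal sketch, not an official tool: the function name vsan_health_vmon_summary is hypothetical, and the match strings are copied from the log excerpt above, so they may differ on other builds.

```shell
# Summarize vsan-health start attempts recorded in a vMon log.
# Usage: vsan_health_vmon_summary [path-to-vmon.log]
# Hypothetical helper; match strings are taken from the excerpt above.
vsan_health_vmon_summary() {
    log="${1:-/var/log/vmware/vmon/vmon.log}"
    # Count how often vMon re-checked the still-initializing service.
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    rechecks=$(grep -c '<vsan-health> Re-check service health' "$log" || true)
    # Count start attempts that ended in a timeout.
    timeouts=$(grep -c '<vsan-health> Service start operation timed out' "$log" || true)
    echo "rechecks=$rechecks timeouts=$timeouts"
}
```

Running vsan_health_vmon_summary with no argument reads /var/log/vmware/vmon/vmon.log. Many re-checks followed by at least one timeout is consistent with the symptom described here. The last entry of the service log can be inspected separately with tail -n 1 /var/log/vmware/vsan-health/vmware-vsan-health-service.log.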
1. In the first case, delete the file /var/log/vmware/vsan-health/vmware-vsan-health-service.log
2. In the second case, delete the excessive log files under /var/log/vmware/vsan-health/. Running a command such as "rm xxx-*.log" directly may fail with an error like "bash: /usr/bin/rm: Argument list too long", because the wildcard expands to more file names than a single command line can hold. If this happens, narrow the pattern and delete the files in smaller batches, for example with "rm xxx-*0.log", "rm xxx-*1.log", and so on.
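As an alternative to narrowing the pattern by hand in step 2, find can do the matching itself, so the shell never expands the wildcard into one oversized argument list. This is a minimal sketch: the function name delete_matching_logs is hypothetical, "xxx" remains the same placeholder used above, and the -maxdepth option assumes a GNU or BSD find.

```shell
# Delete files matching a name pattern in one directory without expanding
# the glob on the command line, which avoids "Argument list too long".
# Usage: delete_matching_logs <directory> <name-pattern>
# Hypothetical helper; assumes find supports -maxdepth (GNU/BSD).
delete_matching_logs() {
    dir="$1"
    pattern="$2"
    # The quoted pattern is matched by find, not by the shell.
    # -maxdepth 1 stays out of subdirectories, -type f skips directories,
    # and "-exec rm -f {} +" passes files to rm in batches of a safe size.
    find "$dir" -maxdepth 1 -type f -name "$pattern" -exec rm -f {} +
}
```

For example, delete_matching_logs /var/log/vmware/vsan-health 'xxx-*.log' removes the matching files in that directory only. The pattern must stay quoted so that it reaches find unexpanded. Running the same find command with -print in place of the -exec clause first lists what would be deleted.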
After completing the steps above, start the service with the following command:
service-control --start vsan-health
Once the service has started, go back to the vCenter web client and refresh the page. Skyline Health should now be displayed.