vmware-vsan-health service on vCenter Server fails to start or remain running. An error occurred while starting service 'vsan-health'
Invalid vCenter Server Status: All required services are not up! Stopped services: 'vsan-health'
This article helps you identify which knowledge base article applies to your situation. Match the log evidence and symptoms you observe against the table below to find the correct resolution.
| vCenter Version Affected | Article | Key Symptoms | Distinguishing Notes |
|---|---|---|---|
| 6.7 | vSAN-Health Service fails to start | Service fails to start; VAMI backup and vSphere Replication fail with 503; PermissionError: [Errno 1] Operation not permitted in /var/log/vmware/vsan-health/vmware-vsan-health-runtime.log.stderr | Most likely after upgrading to 6.7 U3 with vsan-health disabled before the upgrade; Site Recovery Manager operations also affected |
| 6.7 | vmware-vsan-health service fails to start after update to vCenter 6.7U3 with Error : Validation of VMOMI server version got from provider failed | Service fails to start after update to 6.7 U3; VAMI backup fails; FileIONotFoundExceptionE for /etc/vmware-vsan-health/.cns_pgpass in /var/log/vmware/vsan-health/vmware-vsan-health-service.log; Validation of VMOMI server version got from provider failed in /var/log/vmware/vmware-sps/sps.log | Non-VCHA environments; same underlying .cns_pgpass cause as KB 326836 but different trigger context |
| 6.7 | vSAN Health service fails to start after performing vCenter upgrade | Service fails to start after upgrade in a VCHA configuration; VASA/SMS provider registration fails; adding a VM disk fails with a PBM error; ProviderRegistrationFault in /var/log/vmware/vsan-health/vsanvp.log; .cns_pgpass error in /var/log/vmware/vsan-health/vmware-vsan-health-service.log | Specific to VCHA configurations; same underlying .cns_pgpass cause as KB 316413 |
| 8.0.x | vmware-vsan-health service fails to start \| FATAL: password authentication failed for user "cns" | Service fails to start and dumps core; core.vsanvcmgmtd files appear under /var/core; FATAL: password authentication failed for user "cns" in /var/log/vmware/vsan-health/vsanvcmgmtd.log; User "cns" has no password assigned in /var/log/vmware/vpostgres/postgresql.log | CNS user missing from pg_hba.conf or has no password set in VCDB |
| 8.0.x | vmware-updatemgr and vmware-vsan-health services fails to start on vCenter Server 8.0 | vSAN Health tab reports service not running; service repeatedly fails to log into VC; cannot bind '0.0.0.0:80': Address already in use in /var/log/vmware/envoy/envoy.log; Failed to log into VC in /var/log/vmware/vsan-health/vmware-vsan-health-service.log | httpd process is holding port 80 and preventing Envoy from binding it; vmware-updatemgr service also affected |
| 8.0.x | vSAN-Health Service times out and fails to start on vCenter server | Service times out on start; vCenter is using an external CA-signed SSL certificate; sslv3 alert certificate expired in /var/log/vmware/vpxd/vpxd.log; code: 526 (Invalid SSL Certificate) in /var/log/vmware/vsan-health/vsanvcmgmtd.log | Broken or expired root/intermediate CA certificate chain in /etc/vmware-vpx/ssl/rui.crt; only occurs when using custom CA-signed certs |
| 8.0.x | vmware-vsan service stops abruptly and fails to start with the following error "An error occurred while starting service '%(0)s'" | Service stops abruptly and fails to restart; envoy overloaded in /var/log/vmware/vpxd-svcs/vpxd-svcs.log; high count of 503 overload entries in /var/log/vmware/envoy/envoy-access-* | Memory exhaustion in envoy-sidecar; distinguishable from KB 395218 by the envoy overloaded message; fixed in vCenter Server 8.0 U3g |
| 8.0.x | vSAN Health Service will not start and does not display any of the vSAN health/monitoring or administration insider of the vCenter GUI | Service fails to start after a restart or reboot; vSAN views absent from vSphere Client; PermissionError: [Errno 13] Permission denied on vmware-vsan-health.pid in /var/log/vmware/vsan-health/vmware-vsan-health.log | Stale .pid file from a previous service run; no upgrade trigger; distinct from KB 318854, which shows Errno 1 in a different log file |
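
The lookup in the table can be partly automated. The following is a minimal sketch, not a supported tool: it searches the log files named in the table for the quoted error strings and prints the matching article titles. The names `SIGNATURES`, `match_signatures`, and `read_file` are hypothetical and introduced only for illustration. The sketch assumes the strings appear in the logs exactly as quoted in the table, reads each log into memory in full, and skips the wildcard path `/var/log/vmware/envoy/envoy-access-*`. A match is only a hint; confirm it against the Distinguishing Notes column before applying a resolution.

```python
# (log path, substring to look for, article title), all copied from the table.
SIGNATURES = [
    ("/var/log/vmware/vsan-health/vmware-vsan-health-runtime.log.stderr",
     "PermissionError: [Errno 1] Operation not permitted",
     "vSAN-Health Service fails to start"),
    ("/var/log/vmware/vmware-sps/sps.log",
     "Validation of VMOMI server version got from provider failed",
     "vmware-vsan-health service fails to start after update to vCenter 6.7U3 "
     "with Error : Validation of VMOMI server version got from provider failed"),
    ("/var/log/vmware/vsan-health/vsanvp.log",
     "ProviderRegistrationFault",
     "vSAN Health service fails to start after performing vCenter upgrade"),
    ("/var/log/vmware/vsan-health/vsanvcmgmtd.log",
     'FATAL: password authentication failed for user "cns"',
     'vmware-vsan-health service fails to start | FATAL: password '
     'authentication failed for user "cns"'),
    ("/var/log/vmware/envoy/envoy.log",
     "cannot bind '0.0.0.0:80': Address already in use",
     "vmware-updatemgr and vmware-vsan-health services fails to start on "
     "vCenter Server 8.0"),
    ("/var/log/vmware/vpxd/vpxd.log",
     "sslv3 alert certificate expired",
     "vSAN-Health Service times out and fails to start on vCenter server"),
    ("/var/log/vmware/vpxd-svcs/vpxd-svcs.log",
     "envoy overloaded",
     "vmware-vsan service stops abruptly and fails to start with the "
     "following error \"An error occurred while starting service '%(0)s'\""),
    ("/var/log/vmware/vsan-health/vmware-vsan-health.log",
     "PermissionError: [Errno 13] Permission denied",
     "vSAN Health Service will not start and does not display any of the vSAN "
     "health/monitoring or administration insider of the vCenter GUI"),
]


def match_signatures(read_log):
    """Return the article titles whose signature string is found.

    read_log is any callable that maps a log path to its text
    (an empty string when the log is missing or unreadable).
    """
    hits = []
    for path, needle, article in SIGNATURES:
        if needle in read_log(path) and article not in hits:
            hits.append(article)
    return hits


def read_file(path):
    """Read a log file as text; return '' if it cannot be opened."""
    try:
        with open(path, errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    for title in match_signatures(read_file):
        print(title)
```

Passing the reader in as a callable keeps the matching logic independent of the file system, so the same function can be run against logs extracted from a support bundle by supplying a different reader.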