Multiple vCenter services fail to start

Article ID: 324586


Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • Multiple vCenter Server services fail to start.

  • The issue can occur after any of the following operations:
    • Rebooting vCenter Server
    • Reverting vCenter Server to a snapshot
    • Restoring vCenter Server from a backup
    • Updating or patching vCenter Server

  • /var/log/vmware/cloudvm/service-control.log shows the service status below, with the majority of the services in a stopped state:
<YYYY-MM-DD>T<time> INFO service-control Running:
lookupsvc lwsmd vmafdd vmcad vmdird vmonapi vmware-envoy vmware-postgres-archiver vmware-rhttpproxy vmware-statsmonitor vmware-stsd vmware-trustmanagement vmware-vmon vmware-vpostgres vtsdb
<YYYY-MM-DD>T<time> INFO service-control Stopped:
applmgmt observability observability-vapi pschealth vlcm vmcam vmware-analytics vmware-certificateauthority vmware-certificatemanagement vmware-cis-license vmware-content-library vmware-eam vmware-hvc vmware-imagebuilder vmware-infraprofile vmware-netdumper vmware-perfcharts vmware-pod vmware-rbd-watchdog vmware-sca vmware-sps vmware-topologysvc vmware-updatemgr vmware-vapi-endpoint vmware-vcha vmware-vdtc vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-ui vstats wcp
  • /var/log/vmware/vmon/vmon.log contains entries like the following for the services that fail to start:
<YYYY-MM-DD>T<time> In(05) host-21267 <applmgmt> Service start operation timed out.
<YYYY-MM-DD>T<time> In(05) host-21267 <eam> Service start operation timed out.
  • /var/log/vmware/applmgmt/applmgmt.log contains entries similar to:
<YYYY-MM-DD>T<time> AM UTC [22073]ERROR:vmware.vherd.transport.server:Cannot listen: Couldn't listen on ::1:8201: [Errno 99] Cannot assign requested address.
  • /var/log/vmware/eam/eam.log contains entries similar to:
<YYYY-MM-DD>T<time> | ERROR | vim-monitor | VcListener.java | 124 | An unexpected error in the changes polling loop
com.vmware.eam.EamRemoteSystemException: Client error communicating with the vCenter server.
Caused by: com.vmware.vim.vmomi.client.common.UnexpectedStatusCodeException: Unexpected status code: 503
<YYYY-MM-DD>T<time> |  INFO | vim-monitor | VcListener.java | 125 | Full stack trace: com.vmware.eam.EamRemoteSystemException: Client error communicating with the vCenter server.
Caused by: com.vmware.vim.vmomi.client.common.UnexpectedStatusCodeException: Unexpected status code: 503
  • For the vpxd-svcs service, /var/log/vmware/vmon/vmon.log contains the following entries:
<YYYY-MM-DD>T<time> Wa(03) host-#### <vpxd-svcs> Service pre-start command's stderr: Traceback (most recent call last):
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware/site-packages/syncGrpcUtil.py", line 64, in DoGet
<YYYY-MM-DD>T<time> Wa(03)+ host-####     response = urllib.request.urlopen(url)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
<YYYY-MM-DD>T<time> Wa(03)+ host-####     return opener.open(url, data, timeout)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/python3.7/urllib/request.py", line 531, in open
<YYYY-MM-DD>T<time> Wa(03)+ host-####     response = meth(req, response)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
<YYYY-MM-DD>T<time> Wa(03)+ host-####     'http', request, response, code, msg, hdrs)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/python3.7/urllib/request.py", line 569, in error
<YYYY-MM-DD>T<time> Wa(03)+ host-####     return self._call_chain(*args)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
<YYYY-MM-DD>T<time> Wa(03)+ host-####     result = func(*args)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/python3.7/urllib/request.py", line 649, in http_error_default
<YYYY-MM-DD>T<time> Wa(03)+ host-####     raise HTTPError(req.full_url, code, msg, hdrs, fp)
<YYYY-MM-DD>T<time> Wa(03)+ host-#### urllib.error.HTTPError: HTTP Error 403: Forbidden
<YYYY-MM-DD>T<time> Wa(03)+ host-####
<YYYY-MM-DD>T<time> Wa(03)+ host-#### During handling of the above exception, another exception occurred:
<YYYY-MM-DD>T<time> Wa(03)+ host-####
<YYYY-MM-DD>T<time> Wa(03)+ host-#### Traceback (most recent call last):
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware-vpxd-svcs/scripts/linux/pre-start/tagging_grpc_registration.py", line 70, in addEnvoyRoutingForTopology
<YYYY-MM-DD>T<time> Wa(03)+ host-####     addSyncableServiceCluster("tagging-grpc-cluster", 4004)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware/site-packages/syncGrpcUtil.py", line 93, in addSyncableServiceCluster
<YYYY-MM-DD>T<time> Wa(03)+ host-####     tag = DoGet(uri_clusters + 'all')
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware/site-packages/syncGrpcUtil.py", line 68, in DoGet
<YYYY-MM-DD>T<time> Wa(03)+ host-####     log_error(err.code, err.reason, err.headers)
<YYYY-MM-DD>T<time> Wa(03)+ host-#### TypeError: log_error() takes from 1 to 2 positional arguments but 3 were given
<YYYY-MM-DD>T<time> Wa(03)+ host-####
<YYYY-MM-DD>T<time> Wa(03)+ host-#### During handling of the above exception, another exception occurred:
<YYYY-MM-DD>T<time> Wa(03)+ host-####
<YYYY-MM-DD>T<time> Wa(03)+ host-#### Traceback (most recent call last):
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware-vpxd-svcs/scripts/linux/pre-start/main.py", line 100, in <module>
<YYYY-MM-DD>T<time> Wa(03)+ host-####     endpoint_registration_runner(logging_file)
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware-vpxd-svcs/scripts/linux/pre-start/main.py", line 65, in endpoint_registration_runner
<YYYY-MM-DD>T<time> Wa(03)+ host-####     UpdateTaggingServiceGrpcEndpoint(logging_file).run()
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware-vpxd-svcs/scripts/linux/pre-start/tagging_grpc_registration.py", line 55, in run
<YYYY-MM-DD>T<time> Wa(03)+ host-####     self.addEnvoyRoutingForTopology()
<YYYY-MM-DD>T<time> Wa(03)+ host-####   File "/usr/lib/vmware-vpxd-svcs/scripts/linux/pre-start/tagging_grpc_registration.py", line 76, in addEnvoyRoutingForTopology
<YYYY-MM-DD>T<time> Wa(03)+ host-####     raise Exception("Failed to add tagging grpc routing to envoy"
<YYYY-MM-DD>T<time> Wa(03)+ host-#### Exception: Failed to add tagging grpc routing to envoy while executing vpxd-svcs prestart commands
<YYYY-MM-DD>T<time> Wa(03)+ host-####
<YYYY-MM-DD>T<time> Er(02) host-#### <vpxd-svcs> Service pre-start command failed with exit code 1.


Note: The same stack trace also appears in /var/log/vmware/vpxd-svcs/pre-start-vpxd-svcs.log.
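
The log signatures listed above can be checked in one pass. The following is a minimal sketch, not VMware tooling: the function name scan_symptoms and its optional log-root argument are illustrative assumptions, while the log paths and search strings are taken from the excerpts in this article.

```shell
# Sketch: report which of the log signatures quoted in this article are present.
# scan_symptoms is a hypothetical helper name; the first argument is the log
# root and defaults to /var/log/vmware.
scan_symptoms() (
    log_root="${1:-/var/log/vmware}"
    status=1    # stays 1 (failure) until at least one signature matches

    # Each here-document line is "<relative log path>|<fixed string to find>".
    while IFS='|' read -r rel pattern; do
        # -F fixed string, -q quiet, -s no error message for a missing file
        if grep -Fqs -e "$pattern" "${log_root}/${rel}"; then
            printf 'MATCH: %s: %s\n' "${log_root}/${rel}" "$pattern"
            status=0
        fi
    done <<'EOF'
vmon/vmon.log|Service start operation timed out
vmon/vmon.log|Service pre-start command failed
applmgmt/applmgmt.log|Cannot assign requested address
eam/eam.log|Unexpected status code: 503
vpxd-svcs/pre-start-vpxd-svcs.log|Failed to add tagging grpc routing to envoy
EOF
    return "$status"
)
```

Running scan_symptoms with no argument on the appliance prints one MATCH line per signature found and returns a non-zero status when none are present. A match only indicates that the symptoms fit; the proxy configuration still has to be examined as described under Resolution.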

Environment

VMware vCenter Server 7.x

Cause

  • This issue occurs when vCenter Server is configured to use a proxy server and the loopback address (127.0.0.1) is not included in the NO_PROXY list. In this scenario, all communication between the services is sent through the proxy server and is rejected.

Resolution

  • Examine the proxy configuration of vCenter Server using the following command:

cat /etc/sysconfig/proxy

Example configuration of a vCenter with no proxy server:

root@vcsa01 [ ~ ]# cat /etc/sysconfig/proxy
# Enable a generation of the proxy settings to the profile.
# This setting allows to turn the proxy on and off while
# preserving the particular proxy setup.
#
PROXY_ENABLED="no"

# Some programs (e.g. wget) support proxies, if set in
# the environment.
# Example: HTTP_PROXY="http://proxy.provider.de:3128/"
HTTP_PROXY=""

# Example: HTTPS_PROXY="https://proxy.provider.de:3128/"
HTTPS_PROXY=""

# Example: FTP_PROXY="http://proxy.provider.de:3128/"
FTP_PROXY=""

# Example: GOPHER_PROXY="http://proxy.provider.de:3128/"
GOPHER_PROXY=""

# Example: SOCKS_PROXY="socks://proxy.example.com:8080"
SOCKS_PROXY=""

# Example: SOCKS5_SERVER="office-proxy.example.com:8881"
SOCKS5_SERVER=""

# Example: NO_PROXY="www.me.de, do.main, localhost"
NO_PROXY="localhost, 127.0.0.1"



Note: The NO_PROXY parameter in this configuration file lists the addresses for which the proxy is bypassed. By default, localhost and 127.0.0.1 are listed here so that all local communication bypasses the proxy server.

  • The issue described in this article occurs if the NO_PROXY parameter is missing from the configuration or does not include localhost and 127.0.0.1.
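
The condition above can be tested without reading the file by eye. This is a minimal sketch under stated assumptions: no_proxy_has_loopback is a hypothetical helper name, and the check assumes NO_PROXY is a comma-separated list on a single uncommented line, as in the example configuration.

```shell
# Sketch: succeed only if NO_PROXY in the given file lists both localhost and
# 127.0.0.1. no_proxy_has_loopback is a hypothetical helper; the file argument
# defaults to /etc/sysconfig/proxy.
no_proxy_has_loopback() (
    proxy_file="${1:-/etc/sysconfig/proxy}"

    # Take the last uncommented NO_PROXY assignment, then drop quotes and spaces
    # so the value becomes a plain comma-separated list.
    value=$(sed -n 's/^[[:space:]]*NO_PROXY=//p' "$proxy_file" 2>/dev/null \
        | tail -n 1 | tr -d "\"' ")

    for addr in localhost 127.0.0.1; do
        case ",${value}," in
            *,"$addr",*) ;;    # entry present
            *) echo "NO_PROXY does not list ${addr}"; return 1 ;;
        esac
    done
    echo "NO_PROXY lists localhost and 127.0.0.1"
)
```

A missing file or a missing NO_PROXY line is treated the same as an empty list, so the function reports the first absent entry and returns a non-zero status.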

To resolve the issue, follow the procedure below:

  • Take a backup of /etc/sysconfig/proxy.
  • Edit /etc/sysconfig/proxy and ensure that it contains the following line:

NO_PROXY="localhost, 127.0.0.1"

Note: Other addresses may already be configured in the NO_PROXY parameter to bypass the proxy. These can be retained alongside localhost and 127.0.0.1.

  • Save the file.
  • Restart all the services or reboot vCenter Server.
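
The procedure above could be scripted roughly as follows. This is a sketch, not a supported VMware script: ensure_no_proxy_loopback and the .bak backup suffix are assumptions made for illustration. It keeps any existing NO_PROXY entries and adds localhost and 127.0.0.1 only if they are missing.

```shell
# Sketch: back up the proxy file, then make sure NO_PROXY contains localhost
# and 127.0.0.1 while retaining existing entries.
# ensure_no_proxy_loopback and the ".bak" suffix are hypothetical choices.
ensure_no_proxy_loopback() (
    proxy_file="${1:-/etc/sysconfig/proxy}"
    [ -f "$proxy_file" ] || { echo "missing: $proxy_file" >&2; return 1; }
    cp -p "$proxy_file" "${proxy_file}.bak" || return 1

    # Current NO_PROXY value as a comma-separated list without quotes or spaces.
    current=$(sed -n 's/^[[:space:]]*NO_PROXY=//p' "$proxy_file" \
        | tail -n 1 | tr -d "\"' ")
    merged="$current"
    for addr in localhost 127.0.0.1; do
        case ",${merged}," in
            *,"$addr",*) ;;                               # already listed
            *) merged="${merged:+${merged},}${addr}" ;;   # append missing entry
        esac
    done

    # Copy every line except the old NO_PROXY assignment, then append the merged
    # one. Writing back with cat keeps the original file's owner and mode.
    tmp=$(mktemp) || return 1
    grep -v '^[[:space:]]*NO_PROXY=' "$proxy_file" > "$tmp" || true
    printf 'NO_PROXY="%s"\n' "$(printf '%s' "$merged" | sed 's/,/, /g')" >> "$tmp"
    cat "$tmp" > "$proxy_file" && rm -f "$tmp"
)

# Afterwards, on the appliance, restart all services (or reboot):
#   service-control --stop --all && service-control --start --all
```

The rewritten NO_PROXY line ends up at the bottom of the file, which should not matter for a file of independent KEY="value" assignments. Running the function a second time leaves the file unchanged apart from refreshing the backup, so keep a separate copy of the original if it needs to be preserved.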