vCenter server services fail to start, vpxd-svcs operation timed out.

Article ID: 410324

Products

VMware vSphere ESXi

Issue/Introduction

  • This article explains a rare corner case in which the vCenter vpxd-svcs service fails to start because of incorrect permissions on /var/log/vmware/vtsdb/postgresql.log.
  • The example below shows vCenter services failing to start:
    # service-control --all --start
    Operation not cancellable. Please wait for it to finish...
    Performing start operation on service lwsmd...
    Successfully started service lwsmd
    Performing start operation on service vmafdd...
    Successfully started service vmafdd
    Performing start operation on service vmdird...
    Successfully started service vmdird
    Performing start operation on service vmcad...
    Successfully started service vmcad
    Performing start operation on profile: ALL...
    Successfully started service vmware-vmon
    Service-control failed. Error: Failed to start services in profile ALL. RC=1, stderr=Failed to start sps, imagebuilder, vtsdb, vsan-health, vlcm, vpxd-svcs services. Error: Operation timed out
  • Validate the service failure by running the following commands, replacing <service-name> with the name of the failing service:
    # grep '<service-name>' /var/log/vmware/vmon/vmon.log | grep -ivE 'health|counter'
    # grep -B20 'Service pre-start command failed' /var/log/vmware/vmon/vmon.log
  • In vmon.log, vpxd-svcs may be seen timing out repeatedly during startup:

    YYYY-MM-DDTHH:MM:SS In(05) host-58192 Adding service vpxd-svcs.
    YYYY-MM-DDTHH:MM:SS In(05) host-58192 <vpxd-svcs-prestart> Constructed command: /usr/bin/python /usr/lib/vmware-vpxd-svcs/scripts/linux/pre-start/main.py /storage /var/log
    YYYY-MM-DDTHH:MM:SS In(05) host-58192 <vpxd-svcs> Service pre-start command completed successfully.
    ...
    YYYY-MM-DDTHH:MM:SS In(05) host-58192 <vpxd-svcs> Service start operation timed out.
    YYYY-MM-DDTHH:MM:SS Wa(03) host-58192 <vpxd-svcs> Found empty StopSignal parameter in config file. Defaulting to SIGTERM
    YYYY-MM-DDTHH:MM:SS Wa(03) host-58192 <vpxd-svcs> Service exited. Exit code 143
  • Additionally, the file /var/log/vmware/vtsdb/vtsdb-runtime.log.stderr may report:
    YYYY-MM-DDTHH:MM:SS UTC FATAL: could not open log file "/var/log/vmware/vtsdb/postgresql.log": Permission denied
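
The log check above can be sketched as a small shell helper. This is a minimal sketch, not part of the original article: the function name has_pg_log_permission_error is hypothetical, and the pattern assumes the message has the same shape as the FATAL line shown above.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) if the given log file contains the
# PostgreSQL "could not open log file ... Permission denied" error.
# has_pg_log_permission_error is a hypothetical helper name.
has_pg_log_permission_error() {
    log_file="$1"
    # A missing or unreadable log is treated as "no match".
    [ -r "$log_file" ] || return 1
    # Pattern modeled on the FATAL line quoted in this article.
    grep -q 'FATAL:.*could not open log file.*postgresql\.log.*Permission denied' "$log_file"
}
```

On the appliance it could be called as `has_pg_log_permission_error /var/log/vmware/vtsdb/vtsdb-runtime.log.stderr && echo "permission error found"`.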

Environment

vCenter 7.x
vCenter 8.x

Cause

  • Multiple conditions can lead to vpxd-svcs timeout issues. In this specific case:
    • STIG hardening changes were applied.
    • A failed attempt was made to modify vPostgres log rotation.
    • During the upgrade, permissions on /var/log/vmware/vtsdb/ were corrupted or reset.
  • As a result, the vpxd-svcs process could not update postgresql.log, leading to service startup failures.

Resolution

To resolve this issue, restore the correct file permissions on /var/log/vmware/vtsdb/postgresql.log.

  • A healthy vCenter Server shows permissions similar to:
    # ls -la /var/log/vmware/vtsdb/postgresql.log
    -rw------- 1 vtsdbuser users 7837755 Sep 14 21:36 /var/log/vmware/vtsdb/postgresql.log
  • To correct the file permissions, run the following command:
    # chmod 600 /var/log/vmware/vtsdb/postgresql.log
  • After correcting the permissions, restart the vCenter services.
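
The permission check can also be scripted before any change is made. The following is a minimal sketch, assuming GNU stat (`stat -c`) is available on the appliance. The function name check_log_mode and its output strings are hypothetical. It compares only the octal mode. The owner and group shown in the healthy listing (vtsdbuser, users) can be inspected separately with `stat -c '%U:%G'`.

```shell
#!/bin/sh
# Sketch: compare a file's octal mode with an expected value.
# check_log_mode is a hypothetical helper; usage:
#   check_log_mode FILE [EXPECTED_MODE]   (EXPECTED_MODE defaults to 600)
check_log_mode() {
    file="$1"
    expected="${2:-600}"
    if [ ! -e "$file" ]; then
        echo "missing"
        return 2
    fi
    # GNU stat: %a prints the permission bits in octal (for example 600).
    actual=$(stat -c '%a' "$file")
    if [ "$actual" = "$expected" ]; then
        echo "ok"
        return 0
    fi
    echo "mismatch: mode is $actual, expected $expected"
    return 1
}
```

For example, `check_log_mode /var/log/vmware/vtsdb/postgresql.log` reports whether the mode already matches 600. If it reports a mismatch, apply the chmod command above, then restart the services, for instance with `service-control --stop --all` followed by `service-control --start --all`.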