vCenter upgrade to 8.0 fails during stage 2 while starting vpxd-svcs service

Article ID: 324587

Products

VMware vCenter Server

Issue/Introduction

  • In the Upgrade UI, you will see an error similar to "An error occurred while starting service 'vpxd-svcs'"
  • During startup, the vpxd-svcs service fails to connect to the database. In /var/log/vmware/vpxd-svcs/vpxd-svcs.log, you may find entries similar to:

    YYYY-MM-DDTHH:MM:SS.234Z [main [] ERROR com.vmware.cis.core.kv.impl.Provider.VCDBProviderFactory  opId=] Unable to get database connection:
    org.postgresql.util.PSQLException: The server requested password-based authentication, but no password was provided by plugin null

  • From the vPostgres logs, you can see that the role "vpxd_svcs_kv_store" does not exist. In /var/log/vmware/vpostgres/postgresql.log, you will find entries similar to:

YYYY-MM-DDTHH:MM:SS.234 UTC 652d07f4.560c 0 VCDB vpxd_svcs_kv_store [local] 22028 2 FATAL:  password authentication failed for user "vpxd_svcs_kv_store"
YYYY-MM-DDTHH:MM:SS.234 UTC 652d07f4.560c 0 VCDB vpxd_svcs_kv_store [local] 22028 3 DETAIL:  Role "vpxd_svcs_kv_store" does not exist.
        Connection matched pg_hba.conf line 7: "local all all md5"
 

  • During the vpxd-svcs prestart, the DB role creation fails. In /var/log/vmware/vpxd-svcs/pre-start-vpxd-svcs.log, you will find entries similar to:

YYYY-MM-DDTHH:MM:SS.294Z INFO     Executing vpxd-svcs pre start commands.
YYYY-MM-DDTHH:MM:SS.451Z DEBUG    Executing cmd : /usr/bin/python /usr/lib/vmware/site-packages/vsr/db_tool/vpg_sync_registry.py --registry /etc/vmware/service-registry/vpxd-svcs-sub-registry.yaml -U postgres --host /var/run/vpostgres --operation install --instance vpostgres --stage roles --stage grants --database VCDB
YYYY-MM-DDTHH:MM:SS.698Z INFO     Failure to create vpxd-svcs pg role, stdout : b'' | error : b'2023-10-16 09:52:48,685 ERROR : during Generation of the diff: Invalid identifier: "vcenterVA_RO"\n'
 

  • In /var/log/vmware/lookupsvc/prestart.log, you will find errors similar to the lines below from the lookupsvc firstboot, where granting permissions to the DB role fails:

INFO:__main__:Executing lookupsvc prestart script
INFO:__main__:Failure granting permissions to lookupsvc VCDB role. stdout :  | error : YYYY-MM-DD HH:MM:SS,744 ERROR : during Generation of the diff: Invalid identifier: "vcenterVA_RO"
 

  • In the /var/log/vmware/vpostgres/postgresql.log, entries similar to the lines below are found:

YYYY-MM-DD HH:MM:SS.055 UTC 652d07c6.47d3 0 VCDB lookupsvc_sync_db [local] 18387 2 FATAL:  password authentication failed for user "lookupsvc_sync_db"
YYYY-MM-DD HH:MM:SS.055 UTC 652d07c6.47d3 0 VCDB lookupsvc_sync_db [local] 18387 3 DETAIL:  Role "lookupsvc_sync_db" does not exist.
        Connection matched pg_hba.conf line 7: "local all all md5"
YYYY-MM-DD HH:MM:SS.055 UTC 652d07c6.47d3 0 VCDB lookupsvc_sync_db [local] 18387 4 LOG:  could not send data to client: Broken pipe

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
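
When the "Invalid identifier" error is present, the offending role name can be pulled straight out of the prestart logs. The following is a minimal sketch, not a supported tool: the function name extract_invalid_identifiers is hypothetical, and it assumes the message format matches the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper: read log text from standard input and print each
# distinct identifier that was reported as invalid.
# Assumes the message looks like: Invalid identifier: "some_name"
extract_invalid_identifiers() {
    grep -oE 'Invalid identifier: "[^"]+"' \
        | sed -E 's/^Invalid identifier: "(.*)"$/\1/' \
        | sort -u
}

# Example usage on the appliance (log paths taken from this article):
#   cat /var/log/vmware/vpxd-svcs/pre-start-vpxd-svcs.log \
#       /var/log/vmware/lookupsvc/prestart.log | extract_invalid_identifiers
```

If the logs do not contain the error, the function prints nothing, so an empty result does not rule out this issue.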

Environment

VMware vCenter Server 8.x

Cause

This issue is caused by a custom role in the vCenter database whose name uses an unsupported format. The supported format for database role names is [A-Za-z0-9_] (alphanumeric characters and underscore). Any other character in the name causes this issue.

Note: Creating custom roles in the vCenter database is not recommended.

Additional information:
Even though the DB role name "vcenterVA_RO" in this example appears to match the supported format, the issue still occurs because of the way upper-case letters are handled.
If this issue occurs during an upgrade to 8.0U2, the "Invalid identifier" error may not appear in the logs, but this article still applies.
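
Because the "Invalid identifier" error may be absent from the logs, it can help to check the role names themselves. The sketch below flags any name containing characters other than lowercase letters, digits, and underscores. Treating upper-case letters as suspect is an assumption drawn from the note above, not a documented rule, and the function name list_suspect_roles is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read role names (one per line) from standard input
# and print those containing anything other than lowercase letters,
# digits, or underscores. Empty lines are ignored.
# Assumption: upper-case letters are treated as problematic, per the note
# about how they are handled during the upgrade.
list_suspect_roles() {
    grep -vE '^[a-z0-9_]*$' || true
}

# Example usage on the appliance (psql path taken from this article):
#   /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres -At \
#       -c 'SELECT rolname FROM pg_roles' | list_suspect_roles
```

A name printed by this check is only a candidate. Confirm that it is a custom role before changing it.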

Resolution


To resolve the issue, you have to either drop the role or rename it to a supported format.

To drop the role, follow the steps below.

  1. Connect to the vCenter database using the command below:
          /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres
  2. List the database roles using the command below:
                  \du
    Note: The custom database role that is causing the issue should appear in the output.
  3. Drop the role from the database using the command below:
                  DROP ROLE "vcenterVA_RO1";

    Note: If the role has privileges on the database, then the drop role command will not succeed. To drop the role, we need to remove all the privileges associated with the role.

  4. Some third-party solutions (SAS, Movere) are configured with read-only access to the vCenter database. During their configuration, the following commands are used to create the custom role:

    CREATE ROLE "vcenterVA_RO1" login password 'my_password';
    GRANT CONNECT ON DATABASE "VCDB" TO "vcenterVA_RO1";
    GRANT USAGE ON SCHEMA vc TO "vcenterVA_RO1";
    GRANT SELECT ON ALL TABLES IN SCHEMA vc TO "vcenterVA_RO1";

  5. In this case, execute the commands below to revoke those privileges and then drop the role:

    REVOKE SELECT ON ALL TABLES IN SCHEMA vc FROM "vcenterVA_RO1";
    REVOKE USAGE ON SCHEMA vc FROM "vcenterVA_RO1";
    REVOKE CONNECT ON DATABASE "VCDB" FROM "vcenterVA_RO1";
    DROP ROLE "vcenterVA_RO1";

Note: Adjust the database role name accordingly.
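
Since a role cannot be dropped while it still holds privileges, the grants from step 4 have to be reversed first. To make it easier to adjust the role name, the sketch below prints the matching REVOKE and DROP statements for a given name so they can be reviewed before being run in psql. The function name print_drop_role_sql is hypothetical, and the sketch assumes the role holds only the three grants shown in step 4.

```shell
#!/bin/sh
# Hypothetical helper: print the statements that reverse the read-only
# grants from step 4 and then drop the role. Nothing is executed here.
print_drop_role_sql() {
    # Quote the name as an SQL identifier, doubling any embedded quotes.
    role=$(printf '%s' "$1" | sed 's/"/""/g')
    cat <<EOF
REVOKE SELECT ON ALL TABLES IN SCHEMA vc FROM "$role";
REVOKE USAGE ON SCHEMA vc FROM "$role";
REVOKE CONNECT ON DATABASE "VCDB" FROM "$role";
DROP ROLE "$role";
EOF
}

# Example usage:
#   print_drop_role_sql vcenterVA_RO1
```

If the role was granted additional privileges, DROP ROLE will still fail until those are revoked as well.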

To rename the role, follow the step below.

  1. If dropping the role is not an option, use the command below to rename it.

             ALTER ROLE "vcenterVA_RO" RENAME TO "vcentervaro";

Note: After renaming the role, make sure to update the third-party solutions to use the new name.