HCX upgrade failed with error "psql: error: could not connect to server: No such file or directory"

Article ID: 417623

Products

VMware HCX

Issue/Introduction

  • The HCX upgrade reports "completed" and the appliance reboots.
  • After waiting 15+ minutes, the HCX services are still unavailable.
  • It is possible to SSH into the HCX Connector appliance as "admin".
  • Running ccli (which depends on other HCX services being fully operational) returns the following error stack:
    Welcome to HCX Central CLI
    Failed refreshing configuration file.
    -------- Output from /opt/vmware/bin/ccliSetup.pl --------
    Reading CGW information from DB...
    psql: error: could not connect to server: No such file or directory
            Is the server running locally and accepting
            connections on Unix domain socket "/run/postgresql/.s.PGSQL.5432"?
    psql: error: could not connect to server: No such file or directory
            Is the server running locally and accepting
            connections on Unix domain socket "/run/postgresql/.s.PGSQL.5432"?
    malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at /opt/vmware/bin/ccliSetup.pl line 52.
    ----------------------------------------------------------
    Error Details: /opt/vmware/bin/ccliSetup.pl: exit status 1
    Err: config file /root/.ccli doesn't exist.
    root@hcx-connector-appliance [ ~ ]#

Environment

VMware HCX

Cause

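The upgrade is still in progress after the appliance automatically reboots. Depending on the size of the HCX installation, it takes additional time to complete the upgrade and bring up the services. Until the database service is accepting connections, ccli cannot read its configuration from the database and fails with the psql error shown above.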

Resolution

Wait for the upgrade to finish. A typical HCX Manager/Connector appliance upgrade takes approximately 15-30 minutes to complete. With a large number of Service Mesh appliances and/or a large database, it can take longer. If the issue does not resolve itself, collect support logs and open a case with Broadcom Support for assistance.
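
Rather than re-running ccli by hand, the wait can be scripted. The following is a minimal sketch, not a Broadcom-provided tool: it polls for the PostgreSQL Unix domain socket named in the error output and gives up after a timeout. The function name wait_for_socket and its arguments are hypothetical. The socket appearing only suggests that the database has started; it does not prove that every HCX service is up.

```shell
#!/bin/sh
# Hypothetical helper (not part of HCX): wait for a Unix domain socket to appear.
# Usage: wait_for_socket PATH TIMEOUT_SECONDS INTERVAL_SECONDS
# Returns 0 as soon as PATH exists and is a socket, 1 if the timeout expires first.
wait_for_socket() {
    sock=$1
    timeout=$2
    interval=$3
    waited=0
    while [ ! -S "$sock" ]; do
        # Give up once the accumulated wait reaches the timeout.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

For example, "wait_for_socket /run/postgresql/.s.PGSQL.5432 1800 30" checks every 30 seconds for up to 30 minutes, the upper end of the typical upgrade time given above. The timeout and interval are assumptions to adjust for larger installations.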

Additional Information

If you are concerned about how long the HCX upgrade is taking, SSH into the HCX Manager and check the status of the services with the following command:

$ systemctl list-units --type=service --state=running

The below services need to be in a running state:

  UNIT                         LOAD   ACTIVE SUB     DESCRIPTION
  app-engine.service           loaded active running App-Engine
  appliance-management.service loaded active running Appliance Management
  plan-engine.service          loaded active running Migration Planner Engine
  postgresdb.service           loaded active running PostgresDB
  web-engine.service           loaded active running WebEngine
  zookeeper.service            loaded active running Zookeeper
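
Comparing the command output against this list by eye is error-prone, so the check can be sketched as a small filter. This is an illustration under assumptions, not an official HCX utility: the function name missing_services is hypothetical, and it assumes the column layout shown above (UNIT first, SUB fourth).

```shell
#!/bin/sh
# Services this article lists as required to be in a running state.
REQUIRED_SERVICES="app-engine.service appliance-management.service plan-engine.service postgresdb.service web-engine.service zookeeper.service"

# Hypothetical helper: reads the output of
#   systemctl list-units --type=service --state=running
# on stdin and prints every required service that is not reported as running.
missing_services() {
    # Collect unit names (column 1) whose SUB state (column 4) is "running".
    running=$(awk '$4 == "running" { print $1 }')
    for svc in $REQUIRED_SERVICES; do
        # -x matches the whole line, -F treats the name as a fixed string.
        if ! printf '%s\n' "$running" | grep -qxF "$svc"; then
            printf '%s\n' "$svc"
        fi
    done
}
```

Run it as "systemctl list-units --type=service --state=running | missing_services". Empty output means all six services are running; any name printed is still starting or has failed.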


Check the upgrade logs in the following location for progress and additional details:

/common/logs/upgrade/upgrade.log

<timestamps> UTC Validating the upgrade bundle .............................................................................. [   OK ]
<timestamps> UTC Backing-up certs, configs and database before the upgrade .................................................. [   OK ]
<timestamps> UTC Stopping all the services .................................................................................. [   OK ]
<timestamps> UTC Extracting distribution bundle ............................................................................. [   OK ]
<timestamps> UTC Installing the upgrade image ............................................................................... [   OK ]
<timestamps> UTC Updating each components version ........................................................................... [   OK ]
<timestamps> UTC Upgrade successful, restarting the HCX Manager VM .......................................................... [   OK ]
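
To follow progress without reading the whole file, the log can be summarized with grep. The sketch below assumes only the line format shown above, where each completed step ends in an OK marker. The article does not show what a failed or in-progress step looks like, so the helpers (hypothetical names count_ok_steps and upgrade_reached_restart) look only at OK lines.

```shell
#!/bin/sh
# Hypothetical helpers: both read upgrade.log content on stdin.

# Print the number of log lines that end with an OK marker such as "[   OK ]".
count_ok_steps() {
    grep -c '\[ *OK *] *$'
}

# Succeed (exit 0) if the final step shown in this article was logged as OK.
upgrade_reached_restart() {
    grep -q 'Upgrade successful, restarting the HCX Manager VM .*\[ *OK *] *$'
}
```

For example, "count_ok_steps < /common/logs/upgrade/upgrade.log" prints 7 for the complete sample above. A successful upgrade_reached_restart only confirms that the appliance reached the reboot step; the services may still need time to start afterwards.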