Error: "CRITICAL PROBLEM" Cluster Health status reported and repmgr fails to connect to the database

Article ID: 429862

Products

VMware Cloud Director

Issue/Introduction

  • The VAMI UI (https://<vCD cell address>:5480) of the VMware Cloud Director (VCD) appliance shows the health state as 'CRITICAL PROBLEM'.
  • The VAMI also shows a warning message stating:

    'There was an error getting the appliance's embedded database availability status'
  • The Cloud Director UI and APIs are online and functional.
  • Running 'repmgr cluster show' on the VCD cell command line returns the following error:

ERROR: connection to database failed
DETAIL:
connection to server at "<ip address>", port 5432 failed: No route to host
Is the server running on that host and accepting TCP/IP connections?

DETAIL: attempted to connect using:
user=repmgr connect_timeout=2 dbname=repmgr host=<ip address> gssencmode=disable fallback_application_name=search path=

  • The IP address reported by the 'repmgr cluster show' command does not match the IP address configured on eth1 of the VCD database cell.

Environment

VMware Cloud Director 10.6.1.x

Cause

This issue occurs when the /opt/vmware/vpostgres/14/etc/repmgr.conf file on the database cell contains an incorrect IP address. This misconfiguration is most likely the result of an attempt to change the VCD cell IP addresses after the cell was deployed.
 
repmgrd ("replication manager daemon") is a management and monitoring daemon that runs on each node in a database cluster and reports the status of the database cluster to the VAMI UI. An incorrect IP address in the configuration file prevents repmgrd from functioning.
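
The mismatch described above can be checked with a short script. The following is a minimal sketch, not a supported tool. It assumes the conninfo line uses the host=<IPv4 address> form seen in the error output, and that eth1 carries a single IPv4 address. The function names conninfo_host, iface_ipv4 and check_repmgr_ip are hypothetical helpers introduced for this example.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of the VCD appliance.

# Print the host= value from a repmgr conninfo line read on stdin.
conninfo_host() {
    sed -n "s/.*host=\([^ ']*\).*/\1/p" | head -n 1
}

# Print the first IPv4 address (without the prefix length) found in
# `ip -4 -o addr show dev <interface>` output read on stdin.
iface_ipv4() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "inet") { split($(i + 1), a, "/"); print a[1]; exit } }'
}

# Compare the eth1 address with the conninfo host in repmgr.conf.
# Assumes the configuration path given in this article.
check_repmgr_ip() {
    conf=/opt/vmware/vpostgres/14/etc/repmgr.conf
    eth1_ip=$(ip -4 -o addr show dev eth1 | iface_ipv4)
    conf_ip=$(grep -i '^[[:space:]]*conninfo' "$conf" | conninfo_host)
    if [ "$eth1_ip" = "$conf_ip" ]; then
        echo "OK: conninfo host matches eth1 ($eth1_ip)"
    else
        echo "MISMATCH: eth1=$eth1_ip conninfo host=$conf_ip"
        return 1
    fi
}
```

After the functions are pasted into a root shell on the database cell, running check_repmgr_ip prints either an OK line or a MISMATCH line showing both addresses.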

Resolution

To resolve this issue, identify the IP address currently in use on the eth1 interface of the database cell and ensure that it matches the IP address present in the /opt/vmware/vpostgres/14/etc/repmgr.conf file.

  1. Log in to the impacted database cell via SSH with the root user account.
  2. To confirm the eth1 address run this command:

    ip a | grep eth1

  3. To confirm what value is set in the repmgr.conf file run this command:

    grep -i conninfo  /opt/vmware/vpostgres/14/etc/repmgr.conf

  4. Take a backup of the repmgr.conf file.

    cp /opt/vmware/vpostgres/14/etc/repmgr.conf /opt/vmware/vpostgres/14/etc/repmgr.conf.backup

  5. Open the repmgr.conf file for editing with vi.

    vi /opt/vmware/vpostgres/14/etc/repmgr.conf

  6. Modify the IP address found on the line starting with 'conninfo'. Ensure it matches the current IP address of the eth1 interface identified in step 2.
  7. Save the file.

  8. Restart the repmgrd daemon with this command:

    systemctl restart repmgrd

  9. Run this command to confirm that repmgrd is reporting the cluster status:

    repmgr cluster show

  10. Access the VAMI UI and refresh the page. Confirm the health status of the cluster is now reporting correctly.

    Note: You may need to wait up to 60 seconds for the health status to update in the VAMI.

  11. If the database node is part of a database HA cluster, review repmgr.conf on every other node in the cluster to ensure the correct IP address is set on each. If required, repeat steps 1 - 9 on those additional database nodes.
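
Steps 4 through 9 above can also be scripted, which may help when several nodes need the same correction. This is a minimal sketch under stated assumptions, not a supported procedure. It assumes the conninfo line contains a single host=<IPv4 address> entry. The function names set_conninfo_host and fix_repmgr_conf are hypothetical helpers introduced for this example.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of the VCD appliance.

# Rewrite the host= value on the conninfo line of repmgr.conf text read on
# stdin. Every other line and every other conninfo parameter is left unchanged.
set_conninfo_host() {
    new_ip=$1
    sed "/^[[:space:]]*conninfo[[:space:]]*=/s/host=[^ ']*/host=${new_ip}/"
}

# Back up repmgr.conf, write the corrected address, restart repmgrd and
# show the cluster status (steps 4 through 9 of this article).
fix_repmgr_conf() {
    conf=/opt/vmware/vpostgres/14/etc/repmgr.conf
    new_ip=$1
    cp "$conf" "$conf.backup" || return 1
    # Read from the backup and write in place so the file keeps its owner and mode.
    set_conninfo_host "$new_ip" < "$conf.backup" > "$conf" || return 1
    systemctl restart repmgrd
    repmgr cluster show
}
```

On the database cell, the call would be fix_repmgr_conf followed by the eth1 address identified in step 2. Editing the file manually with vi, as described in the steps, remains the safer choice if the conninfo line differs from the assumed form.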