Error: "Failed to establish connection to HBR" after upgrading the Replicator appliance in Cloud Director Availability 4.x

Article ID: 315059

Products

VMware Cloud Director

Issue/Introduction

  • After upgrading a Cloud Director Availability Replicator appliance, you see an error similar to:
Failed to establish connection to HBR
  • When you run the command journalctl -u replicator -b on the Replicator appliance, you see messages similar to:
Mar 04 14:10:16 Replicator hbrsrv.stdout[1025]: Failed to initialize database /etc/vmware/hbrsrv.103.db: Database schema version mismatch
Mar 04 14:10:16 Replicator systemd[1]: hbrsrv.service: Main process exited, code=exited, status=13/n/a
Mar 04 14:10:16 Replicator systemd[1]: hbrsrv.service: Failed with result 'exit-code'.
Mar 04 14:10:20 Replicator rsyslogd[483]: imjournal: journal files changed, reloading... [v8.2306.0 try https://www.rsyslog.com/e/0]
Mar 04 14:10:27 Replicator systemd[1]: hbrsrv.service: Scheduled restart job, restart counter is at 2.
Mar 04 14:10:27 Replicator systemd[1]: Stopped Host-based replication server..
Mar 04 14:10:27 Replicator systemd[1]: Starting Host-based replication server....
Mar 04 14:10:27 Replicator hbrsrv-set-nic-config.sh[1127]: net.ipv4.tcp_rmem 4096 131072 33554432
Mar 04 14:10:27 Replicator systemd[1]: Started Host-based replication server..
Mar 04 14:10:27 Replicator hbrsrv.stdout[1134]: Failed to initialize database /etc/vmware/hbrsrv.103.db: Database schema version mismatch
Mar 04 14:10:27 Replicator systemd[1]: hbrsrv.service: Main process exited, code=exited, status=13/n/a
Mar 04 14:10:27 Replicator systemd[1]: hbrsrv.service: Failed with result 'exit-code'.
  • In the /opt/vmware/h4/replicator/log/replicator.log file on the Replicator appliance, you see entries similar to:
2024-03-04 14:16:11.076 WARN - [71a83d98-####-####-####-##########80] [hbr-poller3] c.v.h.r.m.hbr.HbrGroupCollector     : Unable to poll hbrsrv groups (will retry later)

com.vmware.exception.ConnectionRefusedException: https://127.0.0.1:8123 invocation failed with "org.apache.http.conn.HttpHostConnectException: Connect to 127.0.0.1:8123 [/127.0.0.1] failed: Connection refused"
    at com.vmware.exception.converter.ClientExceptionConverter.convertException(ClientExceptionConverter.java:45)
    at com.vmware.vlsi.util.ExceptionConverterInterceptor.handleException(ExceptionConverterInterceptor.java:30)
    at com.vmware.vim.vmomi.client.common.impl.ResponseImpl.setError(ResponseImpl.java:263)
    at com.vmware.vim.vmomi.client.http.impl.HttpExchangeBase.setResponseError(HttpExchangeBase.java:356)
    ...
  • In the /var/log/upgrade.log file on the Replicator appliance, you see entries similar to:
[2024-03-04 14:08:49 UTC] Attempting to do a database upgrade from version 102 
[2024-03-04 14:08:49 UTC] Upgrading the database from 102 to 103 
[2024-03-04 14:08:49 UTC] Creating a temporary hbrsrv database 
[2024-03-04 14:08:49 UTC] Checking all expected tables exist 
[2024-03-04 14:08:49 UTC] Checking all tables have the expected columns 
[2024-03-04 14:08:49 UTC] New version 103 database already exists in /etc/vmware/hbrsrv.103.db

Note: The preceding log excerpts are examples only. Dates, times, host names, and process IDs will differ in your environment.
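
To check saved log output against the symptoms listed above, a short script can search for the two most specific messages. The following is a minimal sketch, not a supported tool: the function name find_symptoms and the returned keys are hypothetical, and the patterns are taken only from the example excerpts in this article, so the wording may differ in other versions.

```python
import re

# Patterns copied from the example log excerpts in this article.
# Assumption: the message wording is the same on your appliance.
SCHEMA_MISMATCH = re.compile(
    r"Failed to initialize database (?P<path>\S+): Database schema version mismatch"
)
ALREADY_EXISTS = re.compile(
    r"New version (?P<version>\d+) database already exists in (?P<path>\S+)"
)


def find_symptoms(journal_text: str, upgrade_log_text: str) -> dict:
    """Report which of the article's log signatures appear in the given text.

    journal_text: saved output of `journalctl -u replicator -b`
    upgrade_log_text: contents of /var/log/upgrade.log
    """
    mismatch_paths = sorted(
        {m.group("path") for m in SCHEMA_MISMATCH.finditer(journal_text)}
    )
    preexisting = sorted(
        {(int(m.group("version")), m.group("path"))
         for m in ALREADY_EXISTS.finditer(upgrade_log_text)}
    )
    return {
        "schema_mismatch_paths": mismatch_paths,
        "preexisting_databases": preexisting,
        # Both signatures together correspond to the symptoms described here.
        "matches_article": bool(mismatch_paths) and bool(preexisting),
    }
```

The function only reads text that you pass to it and changes nothing on the appliance. A result with matches_article set to True suggests that the logs resemble the ones in this article; it does not replace the analysis done by Broadcom Support.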

Environment

VMware Cloud Director Availability 4.x

Cause

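This issue occurs when a stale HBR database file is present in the /etc/vmware/ directory on the Replicator appliance. In the preceding excerpts, the upgrade reports that a version 103 database already exists at /etc/vmware/hbrsrv.103.db, and the hbrsrv service then fails to initialize that file with a schema version mismatch. Because hbrsrv does not start, the Replicator service's connections to 127.0.0.1:8123 are refused.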
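
When you open a support case, it can help to include a list of the HBR database files currently present. The following read-only sketch lists them; the function name list_hbr_databases is hypothetical, and the hbrsrv.<version>.db naming pattern is an assumption inferred from the single path shown in the log excerpts. It does not determine which file is stale, and it does not modify or remove anything.

```python
import re
from pathlib import Path

# Assumption: database files are named hbrsrv.<version>.db, as suggested by
# /etc/vmware/hbrsrv.103.db in the log excerpts.
DB_NAME = re.compile(r"^hbrsrv\.(\d+)\.db$")


def list_hbr_databases(directory="/etc/vmware"):
    """Return (schema_version, file_name, size_in_bytes) tuples, sorted by version.

    Read-only: the function only lists directory entries and reads file sizes.
    """
    found = []
    for entry in Path(directory).iterdir():
        match = DB_NAME.match(entry.name)
        if match and entry.is_file():
            found.append((int(match.group(1)), entry.name, entry.stat().st_size))
    return sorted(found)


if __name__ == "__main__":
    for version, name, size in list_hbr_databases():
        print(f"version {version}: {name} ({size} bytes)")
```

Do not delete or rename any of the listed files based on this output alone; the Resolution section below describes the supported next step.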

Resolution

To resolve this issue, contact Broadcom Support and note this Article ID (315059) in the problem description. For more information, see Creating and managing Broadcom support cases.

Additional Information

For more information on taking a backup of a Cloud Director Availability appliance by using the backup archive feature, see the Backing up and restoring in the Cloud Director site section of the Cloud Director Availability documentation.