Error: "Failed to establish connection to HBR" after upgrading the Replicator appliance in Cloud Director Availability 4.x
Article ID: 315059
Updated On: 08-26-2024
Products
VMware Cloud Director
Issue/Introduction
After upgrading a Cloud Director Availability Replicator appliance, you see an error similar to:
Failed to establish connection to HBR
When you run the command journalctl -u replicator -b on the Replicator appliance, you see messages similar to:
Mar 04 14:10:16 Replicator hbrsrv.stdout [1025]: Failed to initialize database /etc/vmware/hbrsrv.103.db: Database schema version mismatch
Mar 04 14:10:16 Replicator systemd[1]: hbrsrv.service: Main process exited, code=exited, status=13/n/a
Mar 04 14:10:16 Replicator systemd[1]: hbrsrv.service: Failed with result 'exit-code'.
Mar 04 14:10:20 Replicator rsyslogd [483]: imjournal: journal files changed, reloading... [v8.2306.0 try https://www.rsyslog.com/e/0]
Mar 04 14:10:27 Replicator systemd[1]: hbrsrv.service: Scheduled restart job, restart counter is at 2.
Mar 04 14:10:27 Replicator systemd[1]: Stopped Host-based replication server..
Mar 04 14:10:27 Replicator systemd[1]: Starting Host-based replication server....
Mar 04 14:10:27 Replicator hbrsrv-set-nic-config.sh [1127]: net.ipv4.tcp_rmem 4096 131072 33554432
Mar 04 14:10:27 Replicator systemd[1]: Started Host-based replication server..
Mar 04 14:10:27 Replicator hbrsrv.stdout [1134]: Failed to initialize database /etc/vmware/hbrsrv.103.db: Database schema version mismatch
Mar 04 14:10:27 Replicator systemd[1]: hbrsrv.service: Main process exited, code=exited, status=13/n/a
Mar 04 14:10:27 Replicator systemd[1]: hbrsrv.service: Failed with result 'exit-code'.
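To check a saved copy of the journal output for this signature, a minimal sketch in Python might look like the following. The function name find_schema_mismatches is hypothetical, and the pattern is taken only from the message text quoted in this article, so other appliance versions may word the message differently.

```python
import re

# Pattern assumed from the hbrsrv message quoted in this article.
MISMATCH = re.compile(
    r"Failed to initialize database (?P<path>\S+): "
    r"Database schema version mismatch"
)


def find_schema_mismatches(journal_text):
    """Return the database paths named in schema-mismatch messages.

    Paths are returned in order of first appearance, without duplicates.
    """
    paths = []
    for match in MISMATCH.finditer(journal_text):
        path = match.group("path")
        if path not in paths:
            paths.append(path)
    return paths
```

For example, save the output of journalctl -u replicator -b to a file, read the file, and pass its text to find_schema_mismatches. An empty list means the signature was not found in that output.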
In the /opt/vmware/h4/replicator/log/replicator.log file on the Replicator appliance, you see entries similar to:
2024-03-04 14:16:11.076 WARN - [71a83d98-####-####-####-##########80] [hbr-poller3] c.v.h.r.m.hbr.HbrGroupCollector : Unable to poll hbrsrv groups (will retry later)
com.vmware.exception.ConnectionRefusedException: https://127.0.0.1:8123 invocation failed with "org.apache.http.conn.HttpHostConnectException: Connect to 127.0.0.1:8123 [/127.0.0.1] failed: Connection refused"
at com.vmware.exception.converter.ClientExceptionConverter.convertException(ClientExceptionConverter.java:45)
at com.vmware.vlsi.util.ExceptionConverterInterceptor.handleException(ExceptionConverterInterceptor.java:30)
at com.vmware.vim.vmomi.client.common.impl.ResponseImpl.setError(ResponseImpl.java:263)
at com.vmware.vim.vmomi.client.http.impl.HttpExchangeBase.setResponseError(HttpExchangeBase.java:356)
...
In the /var/log/upgrade.log file on the Replicator appliance, you see entries similar to:
[2024-03-04 14:08:49 UTC] Attempting to do a database upgrade from version 102
[2024-03-04 14:08:49 UTC] Upgrading the database from 102 to 103
[2024-03-04 14:08:49 UTC] Creating a temporary hbrsrv database
[2024-03-04 14:08:49 UTC] Checking all expected tables exist
[2024-03-04 14:08:49 UTC] Checking all tables have the expected columns
[2024-03-04 14:08:49 UTC] New version 103 database already exists in /etc/vmware/hbrsrv.103.db
Note: The preceding log excerpts are only examples. Dates, times, and environment-specific values vary depending on your environment.
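The upgrade.log entry that reports an existing new-version database can be picked out of a copy of the log with a short script. The sketch below is illustrative only: the function name find_existing_new_version_db is hypothetical, and the pattern assumes the exact wording shown in the excerpt above.

```python
import re

# Pattern assumed from the upgrade.log line quoted in this article.
ALREADY_EXISTS = re.compile(
    r"New version (?P<version>\d+) database already exists in (?P<path>\S+)"
)


def find_existing_new_version_db(upgrade_log_text):
    """Return (schema_version, path) for each 'already exists' entry."""
    return [
        (int(match.group("version")), match.group("path"))
        for match in ALREADY_EXISTS.finditer(upgrade_log_text)
    ]
```

Pass the contents of /var/log/upgrade.log as a string. Each returned tuple names a schema version and the database file that the upgrade found already in place.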
Environment
VMware Cloud Director Availability 4.x
Cause
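This issue occurs due to a stale HBR database file in the /etc/vmware/ directory on the Replicator appliance. In the example above, the upgrade from database version 102 to 103 finds that /etc/vmware/hbrsrv.103.db already exists, and the hbrsrv service then fails to start with a database schema version mismatch on that file.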
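When gathering details for a support case, it can help to record which HBR database files are present and when each was last modified. The following read-only sketch lists them. The function name list_hbr_databases is hypothetical, and the hbrsrv.*.db file name pattern is an assumption based on the single file name that appears in the logs above. The script does not modify or remove any file.

```python
import glob
import os
import time


def list_hbr_databases(directory="/etc/vmware"):
    """Return (path, size_bytes, modified_utc) for each hbrsrv.*.db file.

    Read-only: files are only inspected with os.stat, never changed.
    The file name pattern is assumed from the logs in this article.
    """
    entries = []
    for path in sorted(glob.glob(os.path.join(directory, "hbrsrv.*.db"))):
        info = os.stat(path)
        modified = time.strftime(
            "%Y-%m-%d %H:%M:%S", time.gmtime(info.st_mtime)
        )
        entries.append((path, info.st_size, modified))
    return entries
```

A file for the new schema version whose modification time predates the upgrade would be consistent with the stale file described above.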
Resolution
To resolve this issue, contact Broadcom Support and note this Article ID (315059) in the problem description. For more information, see Creating and managing Broadcom support cases.
Additional Information
For more information on taking a backup of a Cloud Director Availability appliance by using the backup archive feature, see the Backing up and restoring in the Cloud Director site section of the Cloud Director Availability documentation.