Enhanced Replication Mapping fails and VM replications reported as "Not Active (RPO Violation)"
Article ID: 394533

Products

VMware Live Recovery

Issue/Introduction

Symptoms:

  • Enhanced Replication Mapping throws the following error:

    "Fault occurred when performing health check. Details: 'Connect: certificate verify failed (SSL routines)'"




  • After the vSphere Replication certificate is renewed or replaced, any change made to a replicated VM's configuration causes the replication to enter a 'Not Active' status.




  • The VM replication may report the following error:

    "A replication error occurred at the vSphere Replication Server for replication: 'No connection to VR Server for virtual machine on host: Unknown'."

 

  • The source ESXi host reports the following in /var/run/log/hbr-agent.log:

    2025-06-11T06:33:03.370Z hbr-agent-bin[4118517]: 2025-06-11T06:33:03.370764 hbr-agent-bin [4118517] [0x000001003dc36700] error: [Proxy [Group: ] -> [#.#.#.#:32032]] SSL handshake failed: certificate verify failed
    2025-06-11T06:33:03.370Z hbr-agent-bin[4118517]: 2025-06-11T06:33:03.370815 hbr-agent-bin [4118517] [0x000001003dc36700] error: [Proxy [Group: ] -> [#.#.#.#:32032]] Failed to connect to broker on 172.22.112.232:32032: certificate verify failed
    2025-06-11T06:33:03.370Z hbr-agent-bin[4118517]: 2025-06-11T06:33:03.370832 hbr-agent-bin [4118517] [0x000001003dc36700] error: [Proxy [Group: ] -> [#.#.#.#:32032]] Failed to connect to broker: certificate verify failed


  • The target vSphere Replication server reports the following in /var/log/vmware/hbrsrv.log:

    2025-06-11T06:54:03.441Z error hbrsrv[20244] [Originator@6876 sub=Asio] Cannot perform SSL handshake for <TCP '#.#.#.# : 32032'> -> <TCP '#.#.#.# : 52106'> (encrypted): short read
    2025-06-11T06:54:03.441Z error hbrsrv[20244] [Originator@6876 sub=Main] HbrError stack:
    2025-06-11T06:54:03.441Z error hbrsrv[20244] [Originator@6876 sub=Main]    [0] Exception Vmacore::Exception: Cannot perform SSL handshake for <TCP '#.#.#.# : 32032'> -> <TCP '#.#.#.# : 52106'> (encrypted): short read
    2025-06-11T06:54:03.441Z error hbrsrv[20244] [Originator@6876 sub=Main]    [1] Failed HbrSrv accept on socket ([N9HbrServer20BoostTCPServerSocketE:0x000055ef026f9048])
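
To check whether a given ESXi host shows the hbr-agent symptom above, the log signature can be counted with a small helper. This is a minimal sketch: the function name `count_cert_failures` is hypothetical, and the default path is simply the one quoted in this article.

```shell
#!/bin/sh
# Hypothetical helper: count "certificate verify failed" lines in an hbr-agent log.
# $1 is the log file; it defaults to the path quoted in this article.
count_cert_failures() {
    log_file="${1:-/var/run/log/hbr-agent.log}"
    # grep -c prints the number of matching lines (it prints 0 and
    # returns a non-zero exit status when nothing matches).
    grep -c 'certificate verify failed' "$log_file"
}
```

Running `count_cert_failures` on the source host before and after applying the resolution gives a rough indication of whether new handshake failures are still being logged.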

Environment

vSphere Replication 9.x

Cause

  • In addition to the vSphere Replication appliance certificate, the certificates on vCenter and the ESXi hosts (at both sites) were replaced as well.

  • On the target (DR) vSphere Replication appliance, the following log line appears in the hbrsrv.log file:

    Cannot perform SSL handshake for <TCP 'x.x.x.x : 32032'> -> <TCP 'x.x.x.x : 51018'> (encrypted): tlsv1 alert unknown ca (SSL routines)

  • When the HMS service was restarted after the certificate change, the broker certificate was not properly pushed out to all ESXi hosts.
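
Because the failure is a trust mismatch, it can help to see which certificate the vSphere Replication server currently serves on port 32032 (the port shown in the logs above). The following is a sketch only, assuming `openssl` is available on the machine where it runs and that the port returns a certificate to an unauthenticated client; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarise a PEM certificate read from stdin.
cert_summary() {
    openssl x509 -noout -subject -issuer -dates -fingerprint -sha256
}

# Hypothetical helper: fetch the certificate served at host:port and summarise it.
# $1 is the host; $2 is the port and defaults to 32032 as seen in the logs.
served_cert_summary() {
    host="$1"
    port="${2:-32032}"
    # </dev/null closes stdin so s_client exits once the handshake is done.
    openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null | cert_summary
}
```

For example, `served_cert_summary VR-SERVER-IP` (placeholder address) prints the subject, issuer, validity dates, and SHA-256 fingerprint, which can be compared against the certificate that was installed during the change.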



Resolution

  • To resolve this issue, restart the HMS and HBRSRV services on both vSphere Replication appliances, then reconnect the appliances.

  • Once this is done, the HMS service pushes the new broker certificate out to all ESXi hosts.

  • You can restart these services from the VAMI:

    • Log in to the vSphere Replication VAMI (https://VRMS-IP:5480)
    • Navigate to Services
    • Select each service in turn (hms, then hbrsrv)
    • Select Restart

  • Alternatively, you can open an SSH session to the vSphere Replication appliance and run:
    • systemctl restart hms
    • systemctl restart hbrsrv

  • Once this is done, 'Reconfigure' the VM replications that showed the 'Not Active' status. Each replication then syncs and its state changes to 'OK'.
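
The SSH variant of the restart can be wrapped so that each restart is followed by a status check. This is a minimal sketch using standard `systemctl` subcommands; the function name `restart_and_check` is hypothetical, and the service names are the ones given in this article.

```shell
#!/bin/sh
# Hypothetical helper: restart each named service, then confirm it is active.
restart_and_check() {
    for svc in "$@"; do
        systemctl restart "$svc" || { echo "restart of $svc failed" >&2; return 1; }
        # is-active prints the unit state and exits non-zero unless it is "active".
        systemctl is-active "$svc" || { echo "$svc is not active" >&2; return 1; }
    done
}

# Example call, run on each vSphere Replication appliance:
#   restart_and_check hms hbrsrv
```

The function stops at the first service that fails to restart or does not report as active, so a failure is visible before the replications are reconfigured.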