Legacy to Enhanced VM Replication Reconfiguration Results in "Not Active" Replication State

Article ID: 409577

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Existing legacy-based VM replications work without issues.

  • Reconfiguring an existing VM replication to use Enhanced replication causes the replication state to become "Not Active", and the following replication error is shown: "A replication error occurred at the vSphere Replication Server for replication '<vm_name>'. Details: 'No connection to VR Server for virtual machine <vm_name> on host <host_name> in cluster <cluster_name> in <datacenter_name>: Unknown'."

  • There are no connectivity issues between the source ESXi host and its vSphere Replication appliance. Running the following command over SSH from the source host against the vSphere Replication appliance, and vice versa, shows a successful connection: openssl s_client -connect <ip>:<port>

Environment

VMware vSphere Replication 9.0.x

Cause

  • This issue can occur when the certificate of the source site's vSphere Replication appliance is incomplete.
  • When using the openssl command to check connectivity between source ESXi host and its local vSphere Replication appliance, the connection is successful. However, upon verifying the certificate information from the output of the command, some fields in the vSphere Replication appliance's certificate appear as "Unknown":

    Example:
    [root@source_esxi] openssl s_client -connect <local_vr_ip>:32032
    Connected
    depth=0 O = Unknown, OU = Unknown, CN = <vr_fqdn/ip>
    .
    .

  • This can also be validated from the VAMI > Certificate page of the vSphere Replication appliance.
  • In the source ESXi host's /var/run/log/hbr-agent.log, the following entries are logged for this issue:

    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] info: [ConfigManager] No user configuration for key=hbrsvc_target_info in ConfigStore.
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] error: [ConfigManager] Failed to get config store object. Comp: esx, Grp: services, Key: hbrsvc_target_info, Id: <local_vr_ip>, Prop: certificate
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] info: [ProxyConnection] Setting up secure tunnel to broker on <local_vr_ip>:32032
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] info: [Proxy [Group: ] -> [<local_vr_ip>:32032]] Connecting to <local_vr_ip>:32032 without specific vmk
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] info: [Proxy [Group: ] -> [<local_vr_ip>:32032]] TCP Connect latency was 2361us
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] error: [Proxy [Group: GID-############] -> [<local_vr_ip>:32032]] The find server request failed: (1) Failed
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166) hbr-agent-bin [3048374]: [0x000000bb69042700] error: [Proxy [Group: GID-############] -> [<local_vr_ip>:32032]] Failed find server request additional error info: Thumbprint and certificate is not allowed to send replication data.
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166)[+] hbr-agent-bin [3048374]: [0x000000bb69042700]: thumbprint: ##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166)[+] hbr-agent-bin [3048374]: [0x000000bb69042700]: certificate:-----BEGIN CERTIFICATE-----
    YYYY-MM-DDTHH:MM:SS.SSSZ In(166)[+] hbr-agent-bin [3048374]: [0x000000bb69042700]: ################################

  • The logs indicate that the "find server" request fails when the source ESXi host attempts to connect to the local vSphere Replication appliance to find a target ESXi host for enhanced replication. This failure is due to an incorrect certificate and thumbprint on the local vSphere Replication appliance.

  • As a result, the VM's replication state is "Not Active" and the VM is not replicated.
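
The two checks described in this section (certificate subject fields shown as "Unknown" and the "find server" failure in hbr-agent.log) can be sketched as a small script. This is a minimal illustration, not a supported tool: the function names are hypothetical, and it assumes the subject line and log text have the same shape as the excerpts in this article.

```python
import re
from typing import Iterable, List

# Log fragments copied from the hbr-agent.log excerpt in this article.
LOG_SIGNATURES = (
    "The find server request failed",
    "Thumbprint and certificate is not allowed to send replication data",
)


def unknown_subject_fields(subject_line: str) -> List[str]:
    """Return subject attribute names (for example O, OU) whose value is 'Unknown'.

    Expects a line shaped like the openssl output in this article:
    'depth=0 O = Unknown, OU = Unknown, CN = <vr_fqdn/ip>'
    """
    # Drop the leading 'depth=N ' prefix so only the subject attributes remain.
    subject = re.sub(r"^\s*depth=\d+\s+", "", subject_line)
    unknown = []
    for part in subject.split(","):
        name, sep, value = part.partition("=")
        if sep and value.strip() == "Unknown":
            unknown.append(name.strip())
    return unknown


def matching_log_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain one of the known failure fragments."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(signature in line for signature in LOG_SIGNATURES)
    ]
```

For example, passing the subject line from the openssl output to unknown_subject_fields() lists the incomplete attributes, and passing the lines of a copied hbr-agent.log file to matching_log_lines() returns only the entries related to this issue.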

Resolution

To resolve this issue:

  1. Log in to the VAMI page of the affected vSphere Replication appliance: https://vr_ip:5480

  2. Open the Certificate page and click "Change". If you use a custom or CA-signed certificate, refer to the document "Change the SSL Certificate of the vSphere Replication Appliance".

  3. After the certificate is changed, reconfigure the vSphere Replication appliance.

  4. Reconnect the site pair from the Site Recovery page.

Once this is done, the re-configuration of VM replication to use Enhanced replication should work.
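
After the certificate is changed, it can be useful to confirm which certificate the appliance now presents. The following is a minimal sketch using only the Python standard library. It assumes the appliance answers TLS on port 32032, as in the openssl example earlier in this article. The function names are hypothetical, and the SHA-256 digest is an assumption: the hash algorithm behind the thumbprint in hbr-agent.log is not stated in this article, so compare the result only against a value known to use the same algorithm.

```python
import hashlib
import ssl


def pem_thumbprint(pem: str, algorithm: str = "sha256") -> str:
    """Return a colon-separated, upper-case digest of a PEM-encoded certificate."""
    # The thumbprint is computed over the DER (binary) form of the certificate.
    der = ssl.PEM_cert_to_DER_cert(pem)
    digest = hashlib.new(algorithm, der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def appliance_thumbprint(host: str, port: int = 32032, algorithm: str = "sha256") -> str:
    """Fetch the certificate presented by host:port and return its thumbprint.

    The certificate is retrieved without validation, which is intentional:
    the goal is only to inspect what the appliance currently presents.
    """
    pem = ssl.get_server_certificate((host, port))
    return pem_thumbprint(pem, algorithm)
```

Calling appliance_thumbprint("<local_vr_ip>") before and after the certificate change should return different values if the new certificate is in use.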