vCenter SRM 5.0 Replication status shows as not active after configuration has completed successfully

Article ID: 324746

Products

VMware Live Recovery

Issue/Introduction


Symptoms:
  • VMware vCenter Site Recovery Manager (SRM) 5.0 deployed with vSphere Replication shows a replication status of Not Active.
  • The status is not active even after the configuration of vSphere Replication completes successfully.
  • Initial Replication or Ongoing Replication tasks fail with the warning:

    No VR servers found for replication

  • The vSphere Replication Management Server (VRMS) on the protected site reports the replication state as Not Active:

    2012-02-16 01:26:40.794 TRACE hms.replication.primaryGroup.status [hms-jobs-main-thread-17] (..db.entities.PrimaryGroupStatusEntity$Adapter) | The status of group GID-3b9ab980-6727-49f7-a94c-21cd4716b244 changed to notActive

  • The VRMS on the recovery site reports NFC errors and a Generic Storage error:

    No accessible host for datastore 4f9e0ed0-bfca314b-b715-0017a477083c; Code set to: Generic storage error.; Failed to find host to establish nfc connection
    inherited from com.vmware.vim.binding.hbr.replica.fault.StorageFault: Error for (datastoreUUID: "4f9e0ed0-bfca314b-b715-0017a477083c"), (diskId: "RDID-b0fa8213-412f-4f4f-9829-a2b515d5233c"), (flags: on-disk-open): No accessible host for datastore 4f9e0ed0-bfca314b-b715-0017a477083c; Code set to: Generic storage error.; Failed to find host to establish nfc connection; Failed to open disk, couldn't create NFC session; Set error flag: on-disk-open; Failed to open replica

  • In the vmkernel.log file, the ESXi host on the protected site where the virtual machine is running reports that it cannot communicate with the vSphere Replication Server on the recovery site:

    2012-05-03T05:45:44.069Z cpu1:4097)WARNING: Hbr: 3855: Failed to establish connection to [10.93.100.230]:31031(groupID=GID-9a1fa8de-cab5-4573-bf9f-30ff21f69679): Failure
    2012-05-03T05:45:45.010Z cpu10:4106)WARNING: Hbr: 1061: Failed to transfer file /vmfs/volumes/4f9e00a1-82a8588a-1479-0017a4770844/ServerB/hbrtmp.1.3792 (groupID=GID-9a1fa8de-cab5-4573-bf9f-30ff21f69679) (offset=0): Cannot send after socket shutdown

  • When logging into SRM using the plugin, you see the error:

    Error is unable to connect to server, the server exceeded the timeout threshold for service requests
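
The vmkernel.log symptom above can be checked quickly from the ESXi shell. The following is a minimal sketch, not part of the original article: the function name hbr_failed_endpoints is hypothetical, and it assumes the warning text matches the format of the sample log line shown above. It prints each unique VR server address and port that the host failed to connect to.

```shell
#!/bin/sh
# hbr_failed_endpoints [LOGFILE...]   (hypothetical helper name)
# Extracts IP:PORT from "Failed to establish connection to [IP]:PORT"
# warnings and prints each endpoint once. Reads stdin if no file is given.
hbr_failed_endpoints() {
    sed -n 's/.*Failed to establish connection to \[\([^]]*\)]:\([0-9][0-9]*\).*/\1:\2/p' "$@" |
        sort -u
}
```

For example, running hbr_failed_endpoints /var/log/vmkernel.log on the protected-site host lists the endpoints to test in the Resolution steps.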


Cause

This issue can occur if:
  • The initial and ongoing replication ports (31031 and 44046) are blocked.
  • The ESXi host at the protected or recovery site is unable to communicate with the vSphere Replication Server (VRS).
  • vCenter Server is unable to validate the SSL certificate thumbprint against a valid entry in the database. The ESXi SSL certificate thumbprint registered with HBR has a value of null.

Resolution

To troubleshoot this issue:
  1. If a firewall is configured between the two SRM sites, ensure that the firewall rules are not blocking the SRM ports. The two main ports required for vSphere Replication are:
    • 31031 – This port is used for initial replication traffic from the ESXi host at the protected site to the VRS at the recovery site.
    • 44046 – This port is used for ongoing replication traffic from the ESXi host at the protected site to the VRS at the recovery site.

      Note: If vSphere Replication is configured to replicate virtual machines from the protected site to the recovery site, the ESXi hosts on the protected site use these two ports to communicate with the remote VRS on the recovery site.

  2. Check whether the ESXi host can communicate with the VRS by using the netcat (nc) command or tcpdump.

    Syntax for netcat on ESXi 5.x:

    nc -z IP_Address Port

    For example:

    # nc -z 192.168.48.133 31031
    Connection to 192.168.48.133 31031 port succeeded!

  3. To ensure that vCenter Server and the VRMS are able to pass the SSL thumbprint information of the ESXi hosts to the VRMS and VRS at the DR site, select vCenter requires verified host SSL certificates in the vSphere Client (Administration > vCenter Server Settings > SSL Settings).

  4. The vCenter Server databases at the primary and secondary sites should have entries for all ESXi hosts in the primary site without any NULL values. To check the vCenter Server database:
    1. Right-click the table dbo.VPX_HOST.
    2. Click Script Table as > Select to > New Query Editor Window > Execute.
    3. Check the Results window and ensure that these columns do not have a value of NULL:
      • Expected Thumbprint
      • Host SSL Thumbprint
      • User Name
      • Password
If any of these columns has a value of NULL, perform a repair installation (using the existing database) or reinstall vCenter Server to reinitialize the database and remove the NULL values.
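
The port checks in steps 1 and 2 can be scripted so that both ports are tested in one pass. This is a minimal sketch under stated assumptions: the function name check_vr_ports and its optional second argument (a substitute probe command, useful for a dry run) are hypothetical, and the probe defaults to the nc -z syntax shown in step 2. If a firewall silently drops packets, nc may wait for some time before reporting a failure.

```shell
#!/bin/sh
# Ports listed in step 1 for vSphere Replication traffic.
VR_PORTS="31031 44046"

# check_vr_ports HOST [PROBE_COMMAND]   (hypothetical helper name)
# Probes each port on HOST and prints one result line per port.
# PROBE_COMMAND defaults to "nc -z"; it is called as: PROBE HOST PORT.
# Returns non-zero if any port could not be reached.
check_vr_ports() {
    host=$1
    probe=${2:-nc -z}
    rc=0
    for port in $VR_PORTS; do
        # $probe is intentionally unquoted so "nc -z" splits into two words.
        if $probe "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port reachable"
        else
            echo "$host:$port NOT reachable"
            rc=1
        fi
    done
    return $rc
}
```

Run it from the ESXi host at the protected site with the address of the VRS at the recovery site, for example check_vr_ports 192.168.48.133. A NOT reachable line for either port points to a firewall rule or routing problem between the sites.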

