The vSphere replication status appears as Not Active on a virtual machine configured for vSphere replication
Article ID: 343806
Products
VMware Live Recovery
VMware vSphere ESXi
Issue/Introduction
Symptoms:
In a virtual machine configured with vSphere Replication (VR), you experience these symptoms:
The virtual machine replication completes successfully, but the replication status for the virtual machine changes to Not Active, whereas it should be Full Sync.
The vSphere Replication Management Server (VRMS) on the recovery site reports the replication state of the virtual machine as Passive:
<YYYY-MM-DD>T<time> TRACE hms.replication.monitor [hms-pcm-dispacher-thread-2] (..hms.replication.ReplicationMonitor) | A change for group property is received. Group MOID:GID-3b9ab980-####-####-####-########244; Property: state; Operation: assign; Value:passive
VRMS on the protected site reports the replication state as Not Active:
<YYYY-MM-DD>T<time> TRACE hms.replication.primaryGroup.status [hms-jobs-main-thread-17] (..db.entities.PrimaryGroupStatusEntity$Adapter) | The status of group GID-3b9ab980-####-####-####-########244 changed to notActive
In the hostd log file, the ESX host on the protected site where the virtual machine is running reports that it is unable to get the status of the vSphere Replication Server on the recovery site:
2012-02-16T01:49:01.105Z [7A228B90 error 'Hbrsvc'] ReplicationGroupget-server-state request failed, will retry (groupID=GID-3b9ab980-####-####-####-########244): Cannot send after socket shutdown
In the vmkernel.log file, the ESX host on the protected site where the virtual machine is running reports that it cannot communicate with vSphere Replication Server on the recovery site:
2012-02-16T01:50:05.362Z cpu3:2051)WARNING:Hbr: 3855: Failed to establish connection to [192.168.42.23]:31031(groupID=GID-3b9ab980-####-####-####-########244):Connection refused
2012-02-16T01:50:35.366Z cpu3:2051)WARNING:Hbr: 513:Connection failed to 192.168.42.23(groupID=GID-3b9ab980-####-####-####-########244): Connection refused
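If a copy of vmkernel.log is available off the host, the connection failures shown above can be collected with a short script. The following is a minimal Python sketch, not a VMware tool: the regular expression is modeled only on the two sample lines in this article, and the function name find_hbr_connection_failures is hypothetical. Other builds may format these messages differently.

```python
import re

# Pattern modeled on the two sample vmkernel.log lines in this article.
# This is an assumption: other ESX builds may word the message differently.
HBR_FAILURE = re.compile(
    r"WARNING:\s*Hbr:\s*\d+:\s*"
    r"(?:Failed to establish connection to|Connection failed to)\s*"
    r"\[?(?P<host>[\d.]+)\]?(?::(?P<port>\d+))?\s*"
    r"\(groupID=(?P<group>[^)]+)\)\s*:\s*(?P<reason>.*)"
)


def find_hbr_connection_failures(lines):
    """Return one dict per Hbr connection-failure line found in `lines`."""
    failures = []
    for line in lines:
        match = HBR_FAILURE.search(line)
        if match is None:
            continue  # not an Hbr connection-failure message
        port = match.group("port")
        failures.append({
            "host": match.group("host"),
            # The second sample line carries no port, so port may be None.
            "port": int(port) if port is not None else None,
            "group": match.group("group"),
            "reason": match.group("reason").strip(),
        })
    return failures
```

Passing the lines of the log file to find_hbr_connection_failures yields the target address, the port (when logged), the replication group ID, and the failure reason for each matching entry, which makes it easier to see which vSphere Replication Server address the host is trying to reach.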
This issue may occur in VR environments that have one or more of these network configurations:
Network Address Translation (NAT) is used between the two replication sites.
A firewall is configured between the two replication sites and the firewall rules are blocking network ports required for vSphere Replication.
Note: Using NAT in a vSphere Replication environment where some VR components have external IP addresses and others have internal addresses is not supported.
Resolution
To resolve this issue:
If NAT is used in the VR environment, all VR components must be excluded from the NAT. All VR components must be able to communicate with each other using either internal addresses or external addresses.
To work around this issue, use an IPsec tunnel between the two sites.
If a firewall is configured between the two replication sites, ensure that the firewall rules are not blocking the VR ports. The two main ports required for vSphere Replication are:
31031 – This port is used for the initial replication traffic from the ESX host at the protected site to the vSphere Replication Server at the recovery site.
44046 – This port is used for ongoing replication traffic from the ESX host at the protected site to the vSphere Replication Server at the recovery site.
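To confirm whether a firewall is blocking these ports, a basic TCP connection test can help. The following is a minimal Python sketch, not a VMware-provided utility. The helper names check_port and check_vr_ports are hypothetical, and the script tests reachability only from the machine it runs on, so the result is meaningful only if that machine sits on the same network path as the ESX hosts on the protected site.

```python
import socket

# The two vSphere Replication ports named in this article.
VR_PORTS = (31031, 44046)


def check_port(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all, which often points to a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        # Any other network error, for example no route to host.
        return f"error: {exc}"


def check_vr_ports(host, timeout=3.0):
    """Check every port in VR_PORTS on `host`; return {port: outcome}."""
    return {port: check_port(host, port, timeout) for port in VR_PORTS}
```

For example, check_vr_ports("192.168.42.23") uses the recovery-site address from the log excerpt in this article and returns one outcome per port. A "refused" result matches the Connection refused message in vmkernel.log, while "timeout" more often indicates that a firewall is silently dropping the traffic.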
Notes:
If vSphere Replication is configured to replicate virtual machines from the protected site to the recovery site, the ESX hosts on the protected site use these two ports to communicate with the remote vSphere Replication Server on the recovery site.
In security hardened environments, Host Based Replication ports 31031 and 44046 may be disabled or blocked using the host firewall. To enable traffic flow on these ports again: