ERROR
Operation Failed
Synchronization monitoring has stopped. Please verify replication traffic connectivity between the source host and the target vSphere Replication Server. Synchronization monitoring will resume when connectivity issues are resolved.
Operation ID: d76ecf44-f558-4d59-831c-607adf9ce753
6/30/2020, 10:37:47 AM EDT
This is a common error indicating a network communication issue between the ESXi host and the vSphere Replication (VR) Server. It can result from a combination of factors, each of which is discussed in detail in this article.
VMware vSphere Replication 8.x
VMware vSphere Replication 9.x
VMware ESXi 6.x
VMware ESXi 7.x
VMware ESXi 8.x
This issue has a wide range of possible causes: VR appliance and ESXi host configuration, administrative changes or mistakes, and environmental problems such as the network or a firewall.
Pick the component that your own analysis points to and follow its troubleshooting steps.
1. HOST
2. VRMS
3. NETWORK
4. OVERLAPPING SUBNET OR IP RANGE
TIP: For the best performance, VMware recommends using one or more dedicated vmnics for replication traffic, or using higher-capacity vmnics (10/20/40 Gbps) with vSphere Network I/O Control.
TIP: Setting the MTU to 9000 on the vSwitches/vDS & the physical switches will give the best replication performance.
NOTE: If the vSphere Replication and vSphere Replication NFC traffic services are not enabled on the relevant VMK adapter, replication traffic defaults to VMK0, which is the host's default management interface.
TIP: Setting the MTU to 9000 on the VMK adapter will give the best replication performance.
A. Check the IP settings of all the VMK adapters in the cluster and ensure that the IP address, subnet mask and default gateway are correct.
B. The IP addresses assigned to all these VMK adapters must belong to the same broadcast domain.
C. If new hosts have been added to the cluster, or if any host is missing this VMK adapter, create the adapter and fill in its IP settings.
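As a rough aid for checks A and B, the following Python sketch groups VMK adapters by the IP network derived from their address and subnet mask, and reports any adapter that falls outside the most common network. The adapter names and addresses are hypothetical, and the function names are illustrative only. The sketch checks the IP subnet alone; it cannot confirm VLAN or Layer 2 membership, which still has to be verified on the switches.

```python
import ipaddress
from collections import Counter


def vmk_networks(adapters):
    """Map each adapter name to the IP network implied by its IP and mask."""
    return {
        name: ipaddress.ip_interface(f"{ip}/{mask}").network
        for name, (ip, mask) in adapters.items()
    }


def find_outliers(adapters):
    """Return (expected_network, adapters_outside_it).

    The most common network is treated as the expected one. This is an
    assumption of the sketch, not a vSphere rule.
    """
    networks = vmk_networks(adapters)
    expected, _ = Counter(networks.values()).most_common(1)[0]
    outliers = sorted(name for name, net in networks.items() if net != expected)
    return expected, outliers


if __name__ == "__main__":
    # Hypothetical inventory: host/adapter -> (IP address, subnet mask)
    inventory = {
        "esx01/vmk1": ("192.168.50.11", "255.255.255.0"),
        "esx02/vmk1": ("192.168.50.12", "255.255.255.0"),
        "esx03/vmk1": ("192.168.51.13", "255.255.255.0"),
    }
    expected, outliers = find_outliers(inventory)
    print(f"Expected network: {expected}")
    print(f"Adapters outside it: {outliers}")
```

Any adapter listed as an outlier is a candidate for a mistyped IP address or subnet mask.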
Configuring static routes for vmkernel ports on an ESXi host
A. From the source ESXi host, ping the target ESXi host's management interface (VMK0 by default, but this can differ between vSphere environments).
B. From the source ESXi host, ping the target ESXi host's replication interface (the VMK adapter you have configured for replication traffic).
C. Repeat tests A and B in the opposite direction, from the target ESXi host to the source ESXi host.
NOTE: ICMP (ping) is disabled in some environments. It will have to be enabled temporarily to perform these tests.
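The tests above can be run from the ESXi shell with vmkping. The Python sketch below only builds the command strings so that they can be reviewed or pasted into a session; it does not run anything on a host. The helper name, IP addresses, and vmk1 are hypothetical. When an MTU is supplied, the sketch adds the do-not-fragment flag and sizes the ICMP payload as the MTU minus 28 bytes (a 20-byte IPv4 header plus an 8-byte ICMP header), which gives 8972 for an MTU of 9000.

```python
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def vmkping_command(target_ip, vmk=None, mtu=None, count=3):
    """Build a vmkping command line for the ESXi shell.

    vmk: source VMkernel interface (-I); omitted means the host picks one.
    mtu: if given, adds -d (do not fragment) and -s with the largest ICMP
         payload that fits in a single frame of that MTU.
    """
    parts = ["vmkping", "-c", str(count)]
    if vmk:
        parts += ["-I", vmk]
    if mtu:
        payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
        parts += ["-d", "-s", str(payload)]
    parts.append(target_ip)
    return " ".join(parts)


if __name__ == "__main__":
    # Test A: management interface of the target host (hypothetical address)
    print(vmkping_command("10.0.0.5"))
    # Test B: replication interface, also validating jumbo frames end to end
    print(vmkping_command("192.168.50.12", vmk="vmk1", mtu=9000))
```

If the plain ping succeeds but the jumbo-frame variant fails, a device in the path is likely not configured for an MTU of 9000.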
1. Check that the appliance is powered on and has not hung while booting into the OS.
2. Check whether an additional VM network adapter has been created for receiving replication traffic. If not, follow the steps in this article.
3. Check that the VM network adapter is attached to the correct network and is connected.
4. Verify the network configuration by logging in to the VRMS Appliance Management Interface (VAMI).
NOTE: The IP configuration via the VRMS Appliance Management Interface supports only one default gateway on the vSphere Replication appliance.
5. If you are unable to make changes to the IP configuration from VAMI, use the commands from the KBs below.
vSphere Replication Appliance and Site Recovery Manager displays the message : No Networking Detected (312781)
Photon Network Manager Commands to update Hostname/IP Address/DNS in SRM & vSphere replication (312686)
6. Adding static routes in the appliance.
Multiple static routes can be added to the 10-eth<NIC_Number>.network file, for example one for each cluster in the target datacenter. Check that each route is correct and that routes for all clusters are present.
NOTE:
1. Source VRMS appliance must have routes for the target ESXi cluster
2. Target VRMS appliance must have routes for the source ESXi cluster
7. vSphere Replication uses SSL/TLS certificates for secure communication between the source and target sites. If the certificates used for replication have expired, are misconfigured, or do not match on both sides, the replication process can fail or not initiate properly.
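For the certificate check in step 7, a small helper can turn a certificate's expiry date into the number of days remaining. The Python sketch below assumes you have already obtained the "notAfter" value in the usual OpenSSL text form (for example "Jan  5 09:34:43 2018 GMT"); how you retrieve it from the appliance is outside the scope of this sketch. The function name is hypothetical, and the check covers expiry only, not misconfiguration or a mismatch between sites.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return days left before a certificate expires (negative if expired).

    not_after: expiry in OpenSSL text form, e.g. "Jan  5 09:34:43 2018 GMT".
    now: optional Unix timestamp, mainly useful for testing.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


if __name__ == "__main__":
    remaining = days_until_expiry("Jan  5 09:34:43 2018 GMT")
    state = "EXPIRED" if remaining < 0 else "valid"
    print(f"{state}: {remaining:.1f} days")
```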
TROUBLESHOOTING TIPS:
1. From the source VRMS, ping the hosts in the target cluster on their replication VMK adapter, and ping the target vCenter.
2. From the target VRMS, ping the hosts in the source cluster on their replication VMK adapter, and ping the source vCenter.
The replication VMK adapter is either VMK0 or a designated VMK adapter with vSphere Replication traffic & vSphere Replication NFC traffic services enabled on it.
NOTE:
1. It's ideal to PING all hosts in the cluster to verify connectivity
2. Replication appliance upgrades can lead to the loss of static routes. Back up the routes before performing an upgrade, and re-add them to the 10-eth<NIC_Number>.network file after the upgrade completes.
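To back up the static routes before an upgrade and re-create them afterwards, the routes can be extracted from the contents of the 10-eth<NIC_Number>.network file and rendered again later. The Python sketch below assumes the file uses systemd-networkd syntax, where each route is a [Route] section with Destination= and Gateway= keys. The function names and sample addresses are hypothetical, and the sketch works on text only: it neither reads from nor writes to the appliance.

```python
import ipaddress


def extract_routes(network_file_text):
    """Return (destination, gateway) pairs from every [Route] section."""
    sections = []
    current = None  # dict while inside a [Route] section, otherwise None
    for raw_line in network_file_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("["):
            if current:
                sections.append(current)
            current = {} if line == "[Route]" else None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    if current:
        sections.append(current)
    return [(s.get("Destination"), s.get("Gateway")) for s in sections]


def render_routes(routes):
    """Render (destination, gateway) pairs as [Route] sections.

    Raises ValueError for a malformed network or gateway, so typos are
    caught before the text is pasted back into the file.
    """
    blocks = []
    for destination, gateway in routes:
        network = ipaddress.ip_network(destination)
        gateway_ip = ipaddress.ip_address(gateway)
        blocks.append(f"[Route]\nDestination={network}\nGateway={gateway_ip}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical file contents copied from the appliance before an upgrade
    saved = (
        "[Match]\nName=eth1\n\n"
        "[Network]\nAddress=192.168.50.20/24\n\n"
        "[Route]\nGateway=192.168.50.1\nDestination=10.20.0.0/24\n"
    )
    print(render_routes(extract_routes(saved)))
```

The sections are parsed by hand rather than with configparser because a file can contain several [Route] sections with the same name, one per route.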
vSphere Replication relies on the network to move traffic, so ongoing maintenance on the network switches, disconnected or loose cables, and similar faults can directly impact replications. Fix the network issues in the environment before troubleshooting vSphere Replication.
1. Check that all the ports required for replication are open.
2. Check whether any NSX firewall policies or other firewalls are blocking replication traffic at the source or target datacenter.
3. Check if the Intrusion Detection System (IDS) or Intrusion Prevention Systems (IPS) settings are interfering with replication traffic.
4. vSphere Replication and Site Recovery Manager do not support network address translation (NAT).
NOTE: Site Recovery Manager does not support network address translation (NAT). If the network that you use to connect the Site Recovery Manager sites uses NAT, attempting to connect the sites results in an error. Use credential-based authentication and network routing without NAT when connecting the sites.
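A quick way to check whether a firewall is blocking a required port is to attempt a TCP connection to it from a machine in the relevant network segment. The Python sketch below is a minimal example of that idea. The host address and port numbers in the usage section are placeholders, not an authoritative list; take the actual ports from the official port requirements for your vSphere Replication version. A successful connection only shows that something accepts TCP on that port; it does not prove that replication traffic itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks
        return False


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False} for each port in ports."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Placeholder target and ports; replace with values for your environment
    target_vr_server = "192.168.50.20"
    ports_to_check = [31031, 44046]
    for port, is_open in check_ports(target_vr_server, ports_to_check).items():
        print(f"{target_vr_server}:{port} -> {'open' if is_open else 'blocked or closed'}")
```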
Overlapping subnets or IP ranges can lead to unpredictable traffic flow, which causes errors such as this one. You can work with an SRM engineer or try to diagnose the problem yourself by following this KB article. We recommend involving your internal networking team and the vSphere networking team to identify the problem.
Start by diagnosing the problem with your internal networking team. If that effort does not produce any leads, consider logging a case with SRM support. SRM support will work with you to diagnose the problem and then, if needed, collaborate with the vSphere networking team to find a resolution.
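Before escalating, it can help to list the subnets in use at both sites and test them for overlap mechanically. The Python sketch below does this with the standard ipaddress module. The subnet labels and CIDR ranges are hypothetical, and the function name is illustrative only. It reports every pair of networks that share addresses, which is a starting point for the discussion with the networking team rather than a diagnosis in itself.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_subnets):
    """Return pairs of labels whose networks share at least one address."""
    networks = {
        label: ipaddress.ip_network(cidr) for label, cidr in named_subnets.items()
    }
    return [
        (label_a, label_b)
        for (label_a, net_a), (label_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical subnets collected from both sites
    subnets = {
        "source-replication": "192.168.50.0/24",
        "target-replication": "192.168.50.128/25",
        "target-management": "10.10.0.0/16",
    }
    for label_a, label_b in find_overlaps(subnets):
        print(f"Overlap: {label_a} and {label_b}")
```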