VMware vSphere Replication (VR) fails to replicate virtual machines, displaying a status of Not Active.
Symptoms:
Replication error: "VM replication error: A replication error occurred at the vSphere Replication Server for replication 'VM'. Details: 'No connection to VR Server for virtual machine VM on host ESXi_Host_Name in cluster Cluster_Name in Datacenter_Name: Network'."
ICMP pings to the vSphere Replication Appliance from the ESXi host on which the affected VM resides succeed when sourced from the management VMkernel interface, but fail when sourced from the dedicated VMkernel interface for vSphere Replication.
The issue typically occurs when replication traffic is on a different subnet than management traffic and has no valid path via the default gateway.
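The symptom check above can be sketched as a small shell helper, run in an SSH session on the ESXi host. This is only an illustration: the function name check_vmk_paths is hypothetical, and it assumes that vmkping accepts -c for the packet count and exits with a non-zero status when no reply is received.

```shell
# Report whether a target answers ICMP from each listed VMkernel interface.
# Assumption: vmkping exits non-zero when the target cannot be reached.
check_vmk_paths() {
    target=$1
    shift
    for vmk in "$@"; do
        if vmkping -I "$vmk" -c 3 "$target" >/dev/null 2>&1; then
            echo "$vmk: reachable"
        else
            echo "$vmk: unreachable"
        fi
    done
}
```

For example, check_vmk_paths <VR_APPLIANCE_IP> vmk0 vmk3 prints one line per interface. A reachable result for the management interface combined with an unreachable result for the replication interface matches the symptom described here.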
Environment:
vSphere Replication 8.x
VMware ESXi 7.x / 8.x
Cause:
The ESXi host lacks a specific routing table entry for the vSphere Replication Appliance's IP address. Consequently, the host attempts to route replication traffic via the default Management gateway instead of the gateway associated with the dedicated replication VMkernel interface. This results in a network timeout/no connection between the ESXi VR agent and the VR Server.
The ping succeeds on vmk0 (the management VMkernel interface) and fails on vmk3 (the vSphere Replication VMkernel interface). This confirms that the VR Appliance is reachable, but that traffic sourced from the replication interface is misrouted. Without a specific route, ESXi follows the default gateway of the primary stack.
[user@esxi_host :~ ] vmkping -I vmk3 <VR_APPLIANCE_IP>
PING <VR_APPLIANCE_IP> (<VR_APPLIANCE_IP>): 56 data bytes
sendto() failed (Network is unreachable)
[user@esxi_host :~ ] vmkping -I vmk0 <VR_APPLIANCE_IP>
PING <VR_APPLIANCE_IP> (<VR_APPLIANCE_IP>): 56 data bytes
64 bytes from <VR_APPLIANCE_IP>: icmp_seq=0 ttl=63 time=0.999 ms
64 bytes from <VR_APPLIANCE_IP>: icmp_seq=1 ttl=63 time=1.017 ms
64 bytes from <VR_APPLIANCE_IP>: icmp_seq=2 ttl=63 time=1.430 ms
The output of "esxcli network ip route ipv4 list" shows why the dedicated replication interface (vmk3) cannot reach the VR Appliance: the ESXi host has no specific route for that traffic.
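To make the routing reasoning concrete, the following sketch tests whether two IPv4 addresses fall within the same subnet, which is the condition under which a VMkernel interface can reach the VR Appliance without any gateway. It is a simplified illustration and not how ESXi implements route lookup. The function names ip_to_int and same_subnet are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if addresses $1 and $2 share a network under netmask $3.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}
```

Calling same_subnet with the replication VMkernel IP, the VR Appliance IP, and the replication netmask shows whether the appliance is directly reachable. When the call fails, the host needs a route that sends this traffic to the replication gateway. Otherwise the traffic follows the default gateway.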
Resolution:
To resolve this issue, add a persistent static route on each affected ESXi host to ensure replication traffic egresses via the correct gateway.
1. Log in to the affected ESXi host(s) via SSH as root.
2. Verify the current routing and connectivity from the replication interface: vmkping -I vmk<X> <VR_APPLIANCE_IP> (replace vmk<X> with your replication VMkernel ID and use the IP address of your VR Appliance).
3. Add a persistent static route for the VR Appliance: esxcli network ip route ipv4 add -n <VR_APPLIANCE_IP>/32 -g <REPLICATION_GATEWAY_IP>
4. Verify that the route has been added to the table: esxcli network ip route ipv4 list
5. Test connectivity again using the vmkping command from Step 2.
6. Monitor the vSphere UI; the replication status should transition to Initial Sync or Ongoing.
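When several hosts need the same fix, the route and verification steps above could be wrapped in a small script so that the route is added only when it is missing. This is a minimal sketch under stated assumptions: the function names has_host_route and ensure_vr_route are hypothetical, awk is assumed to be available in the ESXi shell, and the route list output is assumed to use the column order Network, Netmask, Gateway, Interface, Source.

```shell
# Succeed if the route table on stdin contains a /32 host route
# to address $1 via gateway $2.
# Assumption: columns are Network, Netmask, Gateway, Interface, Source.
has_host_route() {
    awk -v ip="$1" -v gw="$2" '
        $1 == ip && $2 == "255.255.255.255" && $3 == gw { found = 1 }
        END { exit(found ? 0 : 1) }'
}

# Add the host route only if it is missing, then re-test connectivity
# from the replication VMkernel interface.
ensure_vr_route() {
    vr_ip=$1
    gw=$2
    vmk=$3
    if esxcli network ip route ipv4 list | has_host_route "$vr_ip" "$gw"; then
        echo "route to $vr_ip via $gw already present"
    else
        esxcli network ip route ipv4 add -n "$vr_ip/32" -g "$gw" || return 1
        echo "route to $vr_ip via $gw added"
    fi
    vmkping -I "$vmk" -c 3 "$vr_ip"
}
```

A call would look like ensure_vr_route <VR_APPLIANCE_IP> <REPLICATION_GATEWAY_IP> vmk<X>. Checking the table first keeps repeated runs from attempting to add a route that already exists.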