Enhanced replication may fail for one or more VMs while succeeding for the others. Legacy replication may continue to work as expected.
Reconfiguring replication for an affected VM reports errors at the network layer, such as:
There are connection errors. Fix the errors before configuring replications.
Fault occurred while performing health check. Details: '503 Service Unavailable from GET https://##.##.##.##/hbragent/api/v1.0/appPing?broker_ip=##.##.##.##&broker_port=32032&group=PING-G5714e3f0-a2e1-416d-94f5-########'.
This issue occurs when one or more hosts at the DR site are experiencing CRC errors on their network cards. The errors cause intermittent connectivity between the hosts while replication is being configured.
To confirm this, run the command below on each host reporting connectivity issues, replacing vmnicX with the name of the NIC to check (for example, vmnic0).
esxcli network nic stats get -n vmnicX
NIC statistics for vmnic0
   Packets received: 52534452769
   Packets sent: 14466977718
   Bytes received: 69320750193840
   Bytes sent: 7583337761970
   Receive packets dropped: 0
   Transmit packets dropped: 0
   Multicast packets received: 316488322
   Broadcast packets received: 961798728
   Multicast packets sent: 396660
   Broadcast packets sent: 32395
   Total receive errors: 96106816
   Receive length errors: 1
   Receive over errors: 0
   Receive CRC errors: 96106815 ================>>>> CRC errors on NICs.
   Receive frame errors: 0
   Receive FIFO errors: 0
   Receive missed errors: 728
   Total transmit errors: 7694740
   Transmit aborted errors: 0
   Transmit carrier errors: 7694740
   Transmit FIFO errors: 0
   Transmit heartbeat errors: 0
   Transmit window errors: 0
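When several hosts or NICs need checking, reading this output by eye is error-prone. The following is a minimal sketch, not an official tool, of how saved output from the command could be parsed to flag a non-zero CRC counter. The function names parse_nic_stats and has_crc_errors are hypothetical, the sample text is a subset of the counters shown above, and collecting the output from each host is left to the reader.

```python
import re

def parse_nic_stats(text):
    """Parse 'Counter name: value' lines from NIC statistics output into a dict."""
    stats = {}
    for line in text.splitlines():
        # Match a counter name, a colon, and a leading integer value.
        # Anything after the number (such as a manual annotation) is ignored.
        m = re.match(r"^\s*([A-Za-z][A-Za-z ]*?):\s*(\d+)\b", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats

def has_crc_errors(stats):
    """Return True when the 'Receive CRC errors' counter is present and non-zero."""
    return stats.get("Receive CRC errors", 0) > 0

# Subset of the sample output from this article (assumed to be saved as text).
SAMPLE = """NIC statistics for vmnic0
   Packets received: 52534452769
   Total receive errors: 96106816
   Receive CRC errors: 96106815
   Transmit carrier errors: 7694740
"""

stats = parse_nic_stats(SAMPLE)
print(has_crc_errors(stats))
```

A non-zero counter on its own only shows that CRC errors have occurred since the counters were last reset. It does not show whether they are still occurring.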
Engage a physical network engineer to identify and fix the cause of the CRC errors on the network.
Fixing the network issue, or isolating the affected host from the DR site, is expected to clear the errors and allow replication to be configured.
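Because the NIC counters are cumulative, one way to check whether the network fix has taken effect is to capture the statistics twice and compare the CRC counter. The sketch below illustrates that idea under stated assumptions: the helper names read_counter and counter_increase are hypothetical, the "before" value is taken from the sample output in this article, and both "after" values are invented purely for illustration.

```python
import re

def read_counter(text, name):
    """Return the integer value of a named counter line, or None if it is absent."""
    pattern = r"^\s*" + re.escape(name) + r":\s*(\d+)"
    m = re.search(pattern, text, re.MULTILINE)
    return int(m.group(1)) if m else None

def counter_increase(before_text, after_text, name="Receive CRC errors"):
    """Return how much a counter grew between two captures of the NIC statistics."""
    before = read_counter(before_text, name)
    after = read_counter(after_text, name)
    if before is None or after is None:
        raise ValueError("counter not found in one of the captures: " + name)
    return after - before

# 'BEFORE' uses the value from this article; the 'AFTER' values are hypothetical.
BEFORE = "   Total receive errors: 96106816\n   Receive CRC errors: 96106815\n"
AFTER_STABLE = "   Total receive errors: 96106816\n   Receive CRC errors: 96106815\n"
AFTER_GROWING = "   Total receive errors: 96107001\n   Receive CRC errors: 96107000\n"

print(counter_increase(BEFORE, AFTER_STABLE))   # no growth between captures
print(counter_increase(BEFORE, AFTER_GROWING))  # counter still increasing
```

An increase of zero over a representative interval suggests the link is no longer producing CRC errors. A growing counter indicates the physical issue persists.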