In the VMware Live Recovery 9.0.4 appliance, the following error may be seen in the Enhanced Replication Mappings connection test for all or some hosts:
"The source host (id:'host-##', name: '##################') successfully connected to the target broker '##########', but failed to establish a TLS connection between the source host '############' and the target host (id: 'host-##' name: '#############'). The server mappings might not have been updated for the source host '#############' on the target broker '#########' or the target host '###############' certificate has expired. Details: 'Connect: Connection reset by peer'."
The netcat and openssl commands for port 32032 show the errors below, even though no firewall rule is blocking the port:
[root@source_esx:~] nc -zv <target_esx> 32032
nc: connect to <target_esx> 443 port (tcp) failed: Connection refused
[root@source_esx:~] openssl s_client -connect <target_esx>:32032
80BBF83474000000:error:8000006F:system library:BIO_connect:Connection refused:crypto/bio/bio_sock2.c:114:calling connect()
80BBF83474000000:error:10000067:BIO routines:BIO_connect:connect error:crypto/bio/bio_sock2.c:116:
connect:errno=111
Validation:
Observed multiple errors related to client connection failures and dropped connections:
Er(163) hbrsrv[6530583]: [Originator@6876 sub=Main] HbrError stack:
Er(163) hbrsrv[6530583]: [Originator@6876 sub=Main] [0] ClientConnection (client=[target_esxi_ip]:52928) request callback failed: Failed to read: End of file
Er(163) hbrsrv[6530583]: [Originator@6876 sub=Main] [1] Dropping error encountered from network
In(166) hbrsrv[6530577]: [Originator@6876 sub=Delta] HbrSrv cleaning out ClientConnection ([target_esxi_ip]:52928)
In(166) hbrsrv[6530583]: [Originator@6876 sub=StatsLog] HbrEvent: {"clientAddress":"[target_esxi_ip]:52928","eventID":"lwdConnectionReset","groupID":"","serverID":"00000010-0000-0000-0400-000000000000","vimHostName":"vrep_FQDN","hbrEvent":1}
In(166) hbrsrv[6530583]: [Originator@6876 sub=Delta] Destroying client connection (ClientCnx '[target_esxi_ip]:52928' id=0 <shut> <clsd> <uninit>)
In(166) hbrsrv[6530582]: [Originator@6876 sub=Delta] ClientConnection (ClientCnx '[target_esxi_ip]:49152' id=0 <shut> <uninit>) is stopping ...
In(166) hbr-agent-bin[6531120]: [0x000000bb7ed16700] error: [Proxy [Group: PING-GID-6a0e71e9-01de-450c-9a40-fdc078e34e48] -> [target_esxi_ip:32032]] [b8eeb1b3-6ad8-494b-b9d9-43ec06465c50-HMS-1355] SSL handshake failed: Connection reset by peer
In(166) hbr-agent-bin[6531120]: [0x000000bb7ed16700] error: [Proxy [Group: PING-GID-6a0e71e9-01de-450c-9a40-fdc078e34e48] -> [target_esxi_ip:32032]] [b8eeb1b3-6ad8-494b-b9d9-43ec06465c50-HMS-1355] Failed to connect to server target_esxi_ip:32032 using broker info: Connection reset by peer
In(166) hbr-agent-bin[6531120]: [0x000000bb7ec95700] error: [Proxy [Group: PING-GID-6a0e71e9-01de-450c-9a40-fdc078e34e48] -> [target_esxi_ip:32032]] [b8eeb1b3-6ad8-494b-b9d9-43ec06465c50-HMS-1355] Exhausted all server endpoints reported by broker.
In(166) hbr-agent-bin[6531120]: [0x000000bb7ec95700] info: [RESTRequest] [AppPing] [vrep_ipaddress:51152] [b8eeb1b3-6ad8-494b-b9d9-43ec06465c50-HMS-1355] Completing with OK
In(166) hbr-agent-bin[6531120]: [0x000000bb7ec95700] error: [RESTConnection] Error writing response: Broken pipe
ERROR com.vmware.hms.net.HbrAgentHealthMonitorService [hms-main-thread-25] (..hms.net.HbrAgentHealthMonitorService) [] | Error occurred while executing ping test call for group 'PING-GID-4bcc4b64-ace7-4434-9761-732d228a8b5b', broker 'vrep_ipaddress', broker port '32032' from host 'target_esxi_ip'.
VMware ESXi 8.x
vSphere Replication 9.x
VMware Live Recovery 9.x
The issue occurs because the source and destination ESXi hosts cannot establish a stable data connection due to a network failure.
The vmkping tests between the source and target ESXi hosts fail for both MTU 1500 and MTU 9000:
[root@esx0001:~] vmkping -I vmkx target_esxihost -d -s 8972
PING target_esxihost (target_esxihost): 8972 data bytes
--- target_esxihost ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
[root@esx0001:~] vmkping -I vmkx target_esxihost -d -s 1472
PING target_esxihost (target_esxihost): 1472 data bytes
--- target_esxihost ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
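The payload sizes used in the vmkping tests above (8972 and 1472) are the MTU minus the packet headers. The sketch below shows that arithmetic, assuming IPv4 without options (20-byte IP header) and an 8-byte ICMP header. The function name icmp_payload_for_mtu is hypothetical and only illustrates the calculation.

```shell
#!/bin/sh
# Hypothetical helper: largest ICMP echo payload that fits in a single
# unfragmented IPv4 packet for a given MTU.
# Assumption: IPv4 header without options (20 bytes) + ICMP header (8 bytes).
icmp_payload_for_mtu() {
    mtu="$1"
    echo $(( mtu - 20 - 8 ))
}

icmp_payload_for_mtu 1500   # payload to pass to "vmkping -d -s" for MTU 1500
icmp_payload_for_mtu 9000   # payload to pass to "vmkping -d -s" for MTU 9000
```

Because the tests use -d (do not fragment), a packet of this size is dropped by any hop whose MTU is smaller than the value being tested.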
Note:
In Enhanced Replication, data traffic flows directly between the source and target ESXi hosts over the WAN. With both hosts configured for MTU 9000, the Maximum Segment Size (MSS) becomes too large for the WAN, resulting in data packet loss.
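To illustrate the note above, the following sketch compares the TCP MSS derived from the host MTU with what a smaller path MTU can carry. It assumes IPv4 and TCP headers without options (20 bytes each), and the 1500-byte WAN path MTU is only an example value. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers. Assumption: IPv4 header (20 bytes) and TCP header
# (20 bytes), both without options.
tcp_mss_for_mtu() {
    echo $(( $1 - 40 ))
}

# Succeeds (exit status 0) when a full-size segment sent by a host with
# MTU $1 fits, unfragmented, on a path whose smallest MTU is $2.
segment_fits_path() {
    [ "$(tcp_mss_for_mtu "$1")" -le "$(tcp_mss_for_mtu "$2")" ]
}

echo "MSS at MTU 9000: $(tcp_mss_for_mtu 9000)"
echo "MSS at MTU 1500: $(tcp_mss_for_mtu 1500)"

# Example only: hosts at MTU 9000, WAN path assumed to carry 1500.
if segment_fits_path 9000 1500; then
    echo "full-size segments fit the path"
else
    echo "full-size segments are too large for the path"
fi
```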
Engage the networking team to resolve the network connection failure for MTU 9000 between the source and destination ESXi hosts.
Otherwise, configure the replication VMkernel adapters to use MTU 1500 on both the source and target ESXi hosts.
Additionally:
Use an isolated network for vSphere Replication traffic, setting the MTU to 1500 or 9000 as required.
Isolating replication traffic prevents network congestion and ensures optimal performance. See Isolating the Network Traffic of vSphere Replication.
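If the replication VMkernel adapters are set to MTU 1500 as described above, the change can be made from the ESXi shell with esxcli. The sketch below is a dry-run helper that only prints the command for review. The adapter name vmk1 is a placeholder and the function name vmk_mtu_command is hypothetical, so substitute the adapter that actually carries replication traffic on each host.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not run) the esxcli command that
# sets the MTU on a VMkernel adapter. Only the two MTU values discussed in
# this article are accepted by this sketch.
vmk_mtu_command() {
    vmk="$1"
    mtu="$2"
    case "$mtu" in
        1500|9000) ;;
        *) echo "unsupported MTU for this sketch: $mtu" >&2; return 1 ;;
    esac
    echo "esxcli network ip interface set --interface-name=$vmk --mtu=$mtu"
}

# vmk1 is a placeholder adapter name.
vmk_mtu_command vmk1 1500
```

After reviewing the printed command, run it on both the source and target hosts. The current MTU of each adapter can be checked with esxcli network ip interface list, and the vmkping tests can then be repeated with the matching payload size.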