Enhanced Replication Mapping health check failure: "Connect: Input/output error"

Article ID: 434453

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • The Enhanced Replication Mapping health check fails with "Connect: Input/output error"
  • Ping between the source and target ESXi hosts is successful
  • Traceroute to port 32032 is also successful
  • All VMs show the status Not Active (RPO Violation)

Environment

  • VMware vSphere Replication 9.x
  • VMware Live Recovery 9.x

Cause

  • The issue starts with a transient network disruption between the HBR agent and the target system on port 32032. The logs show connection timeouts, input/output errors, and failures to bind to VMkernel interfaces (vmk), all of which indicate unsuccessful connection attempts.
  • Due to this interruption, the HBR agent and HBR server lost communication, leading to heartbeat failures, host disconnect events, and missing broker mappings on the HBR server. These failures caused the services to enter a stale or inconsistent state, where existing sessions were no longer valid.
  • Although the network connectivity was later restored (confirmed by successful ping and traceroute during troubleshooting), the HBR services did not automatically recover from the earlier failure. As a result, they continued to report connection issues until the services were manually restarted to re-establish proper communication.
  • In hms.log, the ping test result shows that connectivity from the source ESXi host to the broker (target_broker:32032) is healthy, while connectivity to the target ESXi host (target_esxi:32032) fails with "Connect: Input/output error".

2026-03-25 15:09:48.925 DEBUG com.vmware.hms.net.HbrAgentHealthMonitorService [hms-main-thread-6612] (..hms.net.HbrAgentHealthMonitorService) [operationID=19e4a92e-585b-4c4a-8027-############-HMS-53240,sessionID=8F7DD48D, operationID=19e4a92e-585b-4c4a-8027-############-HMS-53240,sessionID=8F7DD48D] | Ping test result received: {"group":"PING-GID-8ca2058b-6cf4-45d3-93e2-############","endpoints":{"broker":{"address":"target_broker","port":32032,"connectivity":{"tcp":true,"ssl":true},"latency":{"tcp":{"value":32341,"units":"us"}}},"targets":[{"address":"target_esxi","port":32032,"connectivity":{"tcp":false,"ssl":false},"latency":{"tcp":{"value":75004730,"units":"us"}},"failReason":"Connect: Input/output error"}]}}
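To see how often the health check has failed and with which reason, the ping test results can be filtered out of hms.log. The following is a minimal sketch, not a vendor tool: the function name `hms_ping_fail_reasons` is hypothetical, and it assumes each result is logged on a single line in the format shown above.

```shell
# Print the failReason of each failed HMS ping test, one per matching log line.
# Reads log text on stdin. Assumption: one JSON result per line, as in the
# excerpt above; if a line carries several failReason fields, only the last
# one on that line is printed.
hms_ping_fail_reasons() {
    grep 'Ping test result received' |
        sed -n 's/.*"failReason":"\([^"]*\)".*/\1/p'
}
```

For example, `hms_ping_fail_reasons < hms.log | sort | uniq -c` would give a count per failure reason.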

  • hbr-agent.log on the source ESXi host shows that the source host cannot connect to the target ESXi host on port 32032.

2026-03-25T09:43:43.245Z In(166) hbr-agent-bin[85204367]: [0x00000079c1bc6700] error: [Proxy [Group: GID-51c067cd-8631-4846-8f23-############] -> [target_esxi:32032]] Failed to connect to target_esxi:32032. Using nic 'vmk2'. Error: Connection timed out
2026-03-25T09:43:43.245Z In(166) hbr-agent-bin[85204367]: [0x00000079c1bc6700] error: [Proxy [Group: GID-51c067cd-8631-4846-8f23-############] -> [target_esxi:32032]] Failed to bind to any of the specified VMKs for connection to target_esxi:32032
2026-03-25T09:43:43.245Z In(166) hbr-agent-bin[85204367]: [0x00000079c1bc6700] error: [Proxy [Group: GID-51c067cd-8631-4846-8f23-############] -> [target_esxi:32032]] Failed to connect to server target_esxi:32032 using broker info: Input/output error
2026-03-25T09:43:43.246Z In(166) hbr-agent-bin[85204367]: [0x00000079c1b45700] error: [Proxy [Group: GID-51c067cd-8631-4846-8f23-############] -> [target_esxi:32032]] Exhausted all server endpoints reported by broker.
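When hbr-agent.log is large, it can help to reduce it to the distinct endpoint and VMkernel interface pairs that failed to connect. This sketch assumes the message format of the excerpt above; the function name `hbr_agent_failed_endpoints` is hypothetical.

```shell
# List each distinct "endpoint nic" pair that hbr-agent failed to connect to.
# Reads log text on stdin. Only lines of the form
#   Failed to connect to <host:port>. Using nic '<vmk>'.
# are matched; other failure messages are ignored.
hbr_agent_failed_endpoints() {
    sed -n "s/.*Failed to connect to \([^ ]*\)\. Using nic '\([^']*\)'.*/\1 \2/p" |
        sort -u
}
```

Run as `hbr_agent_failed_endpoints < hbr-agent.log`. The interface name in the output identifies which VMkernel adapter to check for replication traffic.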

  • hbrsrv.log on the target broker appliance confirms that the HBR server marked the host as disconnected after missed heartbeats caused by the communication failure.

2026-03-25T13:42:17.550+05:30 info hbrsrv[2884950] [Originator@6876 sub=StatsLog opID=de660c0f-b108-4844-9871-############-HMSINT-57390130] HbrEvent: {"eventID":"hostDisconnect","hostID":"host-200#","hostAddress":"target_esxi","serverID":"52a8d65e-372a-e921-dff0-############","hbrEvent":1}
2026-03-25T13:42:17.550+05:30 info hbrsrv[2884950] [Originator@6876 sub=Host opID=de660c0f-b108-4844-9871-############-HMSINT-57390130] Heartbeat handler detected dead connection for agent: host-200#/hostd
2026-03-25T13:42:17.550+05:30 info hbrsrv[2884950] [Originator@6876 sub=Main opID=de660c0f-b108-4844-9871-############-HMSINT-57390130] HbrError stack:
2026-03-25T13:42:17.550+05:30 info hbrsrv[2884950] [Originator@6876 sub=Main opID=de660c0f-b108-4844-9871-############-HMSINT-57390130]    [0] Exception Vmacore::InvalidStateException: No connection (host-200#/hostd)
2026-03-25T13:42:17.550+05:30 info hbrsrv[2884950] [Originator@6876 sub=Main opID=de660c0f-b108-4844-9871-############-HMSINT-57390130]    [1] Heartbeat failed (host-200#/hostd)
2026-03-25T13:42:17.551+05:30 info hbrsrv[2884950] [Originator@6876 sub=Main opID=de660c0f-b108-4844-9871-############-HMSINT-57390130]    [2] Ignored error.
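To correlate the disconnect with the hbr-agent errors, the hostDisconnect events can be reduced to a timestamp and host address. This is a rough sketch that assumes the HbrEvent JSON is logged on one line, as in the excerpt above; the function name `hbrsrv_disconnected_hosts` is hypothetical.

```shell
# Print "<timestamp> <hostAddress>" for every hostDisconnect event.
# Reads log text on stdin. Assumption: the timestamp is the first
# space-delimited field and the event JSON sits on the same line.
hbrsrv_disconnected_hosts() {
    grep '"eventID":"hostDisconnect"' |
        sed -n 's/^\([^ ]*\) .*"hostAddress":"\([^"]*\)".*/\1 \2/p'
}
```

Note that the hbrsrv.log timestamps in this article carry a +05:30 offset while the hbr-agent.log timestamps are in UTC, so the two need to be converted to a common time zone before comparing them.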

  • hbrsrv.log on the target ESXi host shows that no valid server mapping was available in the broker due to failed connectivity to the target host, leading to connection failure.

2026-03-25T13:42:56.777+05:30 verbose hbrsrv[1649083] [Originator@6876 sub=Broker groupID=PING-GID-3218e72d-b3a5-4aaa-8b70-############ opID=hsl-0] No server found in broker for group 'PING-GID-3218e72d-b3a5-4aaa-####-27677508665c'
2026-03-25T13:42:56.777+05:30 error hbrsrv[1649083] [Originator@6876 sub=Main groupID=PING-GID-3218e72d-b3a5-4aaa-8b70-############ opID=hsl-0] HbrError for (groupId: "PING-GID-3218e72d-b3a5-4aaa-####-27677508665c") stack:
2026-03-25T13:42:56.777+05:30 error hbrsrv[1649083] [Originator@6876 sub=Main groupID=PING-GID-3218e72d-b3a5-4aaa-8b70-############ opID=hsl-0]    [0] The group does not have a server mapping
2026-03-25T13:42:56.777+05:30 error hbrsrv[1649083] [Originator@6876 sub=Main groupID=PING-GID-3218e72d-b3a5-4aaa-8b70-############ opID=hsl-0]    [1] Converting error to wire failure
2026-03-25T13:42:56.777+05:30 info hbrsrv[1649070] [Originator@6876 sub=Delta] ClientConnection (ClientCnx '[target_esxi]:49154' id=0 <shut> <uninit>) is stopping ...

Resolution

Restart the HBR services on the ESXi host:

/etc/init.d/hbr-agent restart

/etc/init.d/hbrsrv restart
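If the restart has to be repeated on several hosts, the two commands above can be wrapped in a small function that stops at the first failure. This is only a sketch: the function name `restart_hbr_services` and its optional directory argument are hypothetical, added so the wrapper can be tried against stub scripts before it is used on a host.

```shell
# Restart hbr-agent and hbrsrv in order, stopping at the first failure.
# $1 (optional, hypothetical): directory holding the init scripts;
# defaults to /etc/init.d as used in the commands above.
restart_hbr_services() {
    init_dir="${1:-/etc/init.d}"
    for svc in hbr-agent hbrsrv; do
        if [ ! -x "$init_dir/$svc" ]; then
            echo "missing init script: $init_dir/$svc" >&2
            return 1
        fi
        "$init_dir/$svc" restart || return 1
    done
}
```

Called without arguments, it runs the same two restart commands listed above.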