Enhanced replications are in 'Not Active (RPO violation)' state due to faulty ESXi physical uplink

Article ID: 436606

Products

VMware Live Recovery

Issue/Introduction

Symptoms:

  • Multiple virtual machine replications enter a "Not Active (RPO violation)" state. The Site Recovery client displays the following error:

    A replication error occurred at the vSphere Replication Server, Details: 'No connection to VR Server'.

  • Enhanced Replication mapping tests for the target cluster remain in a warning state, indicating network connectivity issues to specific target hosts.

    The source host successfully connected to the target broker but there is no network connectivity between the source host and the target host. Details: 'Connect: Input/output error'.

  • On the source ESXi host where the impacted virtual machine resides, /var/run/log/vmkernel.log reports "Broken pipe" errors and failures to establish connections on port 32032.

    2026-04-07T05:06:52.304Z Wa(180) vmkwarning: cpu80:7633464)WARNING: Hbr: 788: Failed to receive from 127.0.0.1 (groupID=GID-af698a68-e67f-4613-9d12-xxxxxxxxxxxx): Broken pipe
    2026-04-07T05:06:52.304Z Wa(180) vmkwarning: cpu80:7633464)WARNING: Hbr: 2542: Failed to receive extended handshake response
    2026-04-07T05:06:52.304Z Wa(180) vmkwarning: cpu80:7633464)WARNING: Hbr: 5362: Failed to establish connection to [127.0.0.1]:32032 (groupID=GID-af698a68-e67f-4613-9d12-xxxxxxxxxxxx): Broken pipe

  • On the source ESXi host, /var/run/log/hbr-agent.log shows repeated connection timeouts to one specific ESXi host in the remote cluster. The replication mapping tests flag the same target host, which points to a localized network path failure affecting every virtual machine that replicates to that host.

    2026-04-07T05:09:52.302Z In(166) hbr-agent-bin[2104431]: [0x0000009cf85a4640] error: [Proxy [Group: GID-af698a68-e67f-4613-9d12-xxxxxxxxxxxx] -> [#.#.#.176:32032]] Failed to connect to #.#.#.176:32032. Using nic 'vmk2'. Error: Connection timed out
    2026-04-07T05:09:52.302Z In(166) hbr-agent-bin[2104431]: [0x0000009cf85a4640] error: [Proxy [Group: GID-af698a68-e67f-4613-9d12-xxxxxxxxxxxx] -> [#.#.#.176:32032]] Failed to bind to any of the specified VMKs for connection to #.#.#.176:32032
    2026-04-07T05:09:52.302Z In(166) hbr-agent-bin[2104431]: [0x0000009cf85a4640] error: [Proxy [Group: GID-af698a68-e67f-4613-9d12-xxxxxxxxxxxx] -> [#.#.#.176:32032]] Failed to connect to server #.#.#.176:32032 using broker info: Input/output error
    2026-04-07T05:09:52.303Z In(166) hbr-agent-bin[2104431]: [0x0000009cf8625640] error: [Proxy [Group: GID-af698a68-e67f-4613-9d12-xxxxxxxxxxxx] -> [#.#.#.176:32032]] Exhausted all server endpoints reported by broker.

  • vmkping tests from the source host to the target replication interface fail with 100% packet loss.

    vmkping -I vmk2 -s 1472 #.#.#.176
    PING #.#.#.176 (#.#.#.176): 1472 data bytes

    --- #.#.#.176 ping statistics ---
    3 packets transmitted, 0 packets received, 100% packet loss
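
The cross-referencing of hbr-agent.log entries described above can be sketched in Python. This is a minimal illustration, not a supported tool: the function name tally_failed_targets and the regular expression are assumptions modeled only on the "Failed to connect to ... Using nic ... Error: ..." line format shown in the excerpt, and other builds may log differently.

```python
import re
from collections import Counter

# Assumed line format, taken from the hbr-agent.log excerpt in this article:
#   Failed to connect to <ip>:<port>. Using nic '<vmk>'. Error: <reason>
FAILED_CONNECT = re.compile(
    r"Failed to connect to (?P<target>\S+?):(?P<port>\d+)\. "
    r"Using nic '(?P<nic>[^']+)'\. Error: (?P<reason>.+)$"
)


def tally_failed_targets(lines):
    """Count connection failures per (target address, vmk, reason)."""
    counts = Counter()
    for line in lines:
        match = FAILED_CONNECT.search(line)
        if match:
            key = (match["target"], match["nic"], match["reason"].strip())
            counts[key] += 1
    return counts
```

Passing the lines of a copy of hbr-agent.log to tally_failed_targets and sorting the result by count makes it easier to see whether the failures cluster on a single target address, as they do in this article.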

Environment

vSphere Replication 9.x

Cause

A faulty physical network adapter (vmnic) on the target ESXi host backs the replication VMkernel interface, causing intermittent or total loss of network connectivity with the source ESXi hosts.

Cause Validation:

  • Identify the suspect physical uplink by running esxtop (press 'n' for the network view) and noting which vmnic appears in the TEAM-PNIC column for the replication VMkernel interface. In this example, vmk2 is teamed with vmnic1.

    PORT-ID   USED-BY  TEAM-PNIC  DNAME         PKTTX/s  MbTX/s   PSZTX  PKTRX/s  MbRX/s   PSZRX  %DRPTX  %DRPRX
    ########  vmk0     vmnic4     DvsPortset-0   222.97    0.54  318.00   250.43    0.47  244.00    0.00    0.00
    ########  vmk1     vmnic5     DvsPortset-0     0.00    0.00    0.00     0.00    0.00    0.00    0.00    0.00
    ########  vmk2     vmnic1     DvsPortset-0     0.00    0.00    0.00     0.00    0.00    0.00    0.00    0.00
    ########  vmk10    vmnic5     DvsPortset-0    30.33    0.02   66.00     0.00    0.00    0.00    0.00    0.00

  • Check for hardware-level errors on the suspected NIC using the following command: esxcli network nic stats get -n <vmnic_name>

    esxcli network nic stats get -n vmnic1
    NIC statistics for vmnic1
       Packets received: 29524738
       Packets sent: 95811
       Bytes received: 2826380906
       Bytes sent: 15774944
       Receive packets dropped: 0
       Transmit packets dropped: 0
       Multicast packets received: 27913387
       Broadcast packets received: 1371768
       Multicast packets sent: 94234
       Broadcast packets sent: 1450
       Total receive errors: 744
       Receive length errors: 0
       Receive over errors: 0
       Receive CRC errors: 0
       Receive frame errors: 0
       Receive FIFO errors: 0
       Receive missed errors: 0
       Total transmit errors: 0
       Transmit aborted errors: 0
       Transmit carrier errors: 0
       Transmit FIFO errors: 0
       Transmit heartbeat errors: 0
       Transmit window errors: 0

  • Run the command several times and confirm whether "Total receive errors" or other error counters are incrementing.
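
Checking whether counters are incrementing means comparing two samples of the same command output taken some time apart. The following Python sketch shows one way to do that comparison. The function names parse_nic_stats and incrementing_error_counters are hypothetical, and the parser assumes only the "Name: value" layout shown in the output above.

```python
def parse_nic_stats(text):
    """Parse 'esxcli network nic stats get' output into {counter name: int}."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        # Skip the "NIC statistics for ..." heading and any non-numeric lines.
        if sep and value.strip().isdigit():
            stats[name] = int(value)
    return stats


def incrementing_error_counters(before, after):
    """Return the error and drop counters that grew between two samples."""
    return {
        name: after[name] - before[name]
        for name in after
        if ("errors" in name or "dropped" in name)
        and name in before
        and after[name] > before[name]
    }
```

Save the command output twice, a few minutes apart, parse each copy with parse_nic_stats, and pass the two dictionaries to incrementing_error_counters. An empty result means no error or drop counter grew during the interval. A non-empty result lists each growing counter with its increase.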

Resolution

 

Engage the hardware vendor to troubleshoot or replace the physical NIC/cable/switch port.

Workaround: If the host has redundant uplinks, administratively disable the faulty NIC to force traffic failover to a healthy member of the team:

esxcli network nic down -n <vmnic_name>

Verify that vmkping is now successful and that replications resume synchronization.
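
The vmkping verification can also be scripted against captured output. The Python sketch below is illustrative only: the names packet_loss_percent and path_is_healthy are hypothetical, and the pattern assumes the summary line format shown in the vmkping example in the Symptoms section.

```python
import re

# Assumed summary format, as in the vmkping output shown in this article:
#   3 packets transmitted, 0 packets received, 100% packet loss
SUMMARY = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) packets received, "
    r"(?P<loss>\d+(?:\.\d+)?)% packet loss"
)


def packet_loss_percent(ping_output):
    """Return the packet-loss percentage from the summary line, or None."""
    match = SUMMARY.search(ping_output)
    return float(match["loss"]) if match else None


def path_is_healthy(ping_output):
    """Healthy only when a summary line exists and reports 0% loss."""
    return packet_loss_percent(ping_output) == 0.0
```

Output with no summary line is treated as unhealthy, so a truncated or failed capture is not mistaken for a working path.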