VMs in "Not Active" state, RPO Violations or error "Failed to connect to VR server"

Article ID: 382175

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • After network isolation is configured for vSphere Replication data and NFC traffic, VMs report RPO violation errors in the DR UI
  • The VMs go to a "Not Active" state in the DR UI
  • The network isolation task completes successfully in the VAMI UI
  • The vSphere Replication isolation network is configured as per recommendations
  • Communication over the virtual NIC and the VMkernel uplink on ports 31031, 443 and 902 in the target site works as expected (see "Troubleshooting network and TCP/UDP port connectivity issues on ESXi")
  • You may receive the error "Failed to connect to VR server" while configuring replication in the DR UI

Steps to Validate Issue:

  • Run the command below on a source ESXi host to find the VM ID of a powered-on VM that is configured for replication with the affected target Replication Server

#vim-cmd vmsvc/getallvms |grep -i "VM_Name"

  • Check the replication state of the VM, using the VM ID returned by the previous command

#vim-cmd hbrsvc/vmreplica.getState "VM ID from previous step"

Retrieve VM running replication state:
        The VM is configured for replication. Current replication state: Group: GID-0789481a-8004-3604-xxxx-yyyy(generation=123456789)
        Group State: lwd delta (instanceId=19065dd0-b292-407f-xxxx-yyyy)
                DiskID RDID-dd42c93d-22e3-3149-xxxx-yyyy State: inactive
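
As a rough sketch, the state check can be scripted by piping the `vmreplica.getState` output through a small filter. The function name `inactive_disks` is hypothetical, and the pattern assumes the output layout shown above; other releases may format the output differently.

```shell
# Hypothetical helper (not part of ESXi): reads vmreplica.getState output on
# stdin and prints the RDID of every disk whose reported state is "inactive".
# Assumes each disk line looks like: DiskID RDID-... State: <state>
inactive_disks() {
    sed -n 's/^[[:space:]]*DiskID[[:space:]]*\(RDID-[0-9A-Za-z-]*\)[[:space:]]*State: inactive.*$/\1/p'
}

# Example, with the VM ID taken from the getallvms step:
#   vim-cmd hbrsvc/vmreplica.getState <VM ID> | inactive_disks
```

An empty result means no disk line matched the "inactive" pattern; it does not by itself prove that replication is healthy.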

  • Check whether the replication target IP recorded in the VM's VMX file ("hbr_filter.destination") is DIFFERENT from the target Replication Server IP configured during network isolation:

#cat /vmfs/volumes/XXXX/XXXX/"VM_Name".vmx |grep hbr

scsi0:0.filters = "hbr_filter"
scsi0:0.hbr_filter.rdid = "RDID-dd42c93d-22e3-3149-xxxx-yyyy"
scsi0:0.hbr_filter.persistent = "hbr-persistent-state-RDID-dd42c93d-22e3-3149-xxxx-yyyy.psf"
hbr_filter.configGen = "2"
hbr_filter.gid = "GID-0789481a-8004-3604-8618-a090840dcfd1"
hbr_filter.destination = "xx.yy.zz.123"
hbr_filter.port = "31031"
hbr_filter.rpo = "30"
hbr_filter.netCompression = "TRUE"
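
The comparison above could be scripted along these lines. This is a minimal sketch, not a supported tool: the function names `hbr_destination` and `check_destination` are hypothetical, and the expected IP is whatever address was configured for the target Replication Server during network isolation.

```shell
# Hypothetical helper: print the value of hbr_filter.destination from the
# .vmx file given as $1 (prints nothing if the key is absent).
hbr_destination() {
    sed -n 's/^hbr_filter\.destination[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$1"
}

# Hypothetical helper: compare the destination in the .vmx file ($1) with the
# expected target replication IP ($2). Returns 0 on match, 1 on mismatch.
check_destination() {
    actual=$(hbr_destination "$1")
    if [ "$actual" = "$2" ]; then
        echo "MATCH: $actual"
    else
        echo "MISMATCH: vmx has '$actual', expected '$2'"
        return 1
    fi
}
```

After defining the functions in the ESXi shell, a call such as `check_destination /vmfs/volumes/XXXX/XXXX/"VM_Name".vmx <expected target IP>` reports whether the VMX file still carries a different destination.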

 

 

Environment

VMware vSphere ESXi

VMware vSphere Replication 8.x

VMware vSphere Replication 9.x

Network Isolation for vSphere Replication Traffic Configured

Cause

  • This issue occurs when the "dr-config" service fails to update the target IP in the HMS DB
  • "dr-config" is the service responsible for updating configuration details from the VAMI UI to the HMS DB
  • While configuring replication, HMS picks up the target IP from its DB
  • The replication configuration is then pushed to the VMX file of the VM
  • The hbr-agent, which manages replication for the VMs on the ESXi host, then uses these configuration parameters to replicate the data
  • If the IP in the HMS DB is not updated when the network configuration changes, the outdated IP is pushed to the VMX file. Replication for the affected VMs then runs into RPO violations and the VMs go to the "Not Active" state

Resolution

  • If the "hbr_filter.destination" IP differs from the expected IP of the target replication server for replication and NFC traffic:
    • Reconfigure replication for the VM and check whether the target replication IP is updated on the ESXi host
    • Re-run the network isolation task in the VAMI UI and confirm that the target replication IP is updated

If the symptoms match and the two steps above do not resolve the issue, contact Broadcom Support for further investigation.

 

Additional Information