hostd is not responding when running test recovery or planned migration by SRM and vSphere Replication for over 10 virtual machines

Article ID: 339571

Updated On:

Products

VMware Live Recovery
VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • hostd stops responding to vCenter Server when a test recovery or planned migration is run with SRM and vSphere Replication for more than 10 virtual machines.
  • The Guest OS quiescing setting is enabled.
  • The test recovery or planned migration fails at step 1 (Synchronize storage) with the following error:
 
VR synchronization failed for VRM group: testVM_XX. A generic error occurred in the vSphere Replication Management Server. Exception details: 'java.net.SocketTimeoutException: Read timed out'.



Environment

VMware vCenter Site Recovery Manager 8.x
VMware vSphere Replication 8.x
VMware vSphere ESXi 6.5
VMware vSphere ESXi 6.0
VMware vSphere Replication 6.1.x
VMware vSphere ESXi 6.7
VMware Site Recovery Manager 6.1.x

Cause

During a test recovery or planned migration, SRM triggers replication for all protected virtual machines at the same time, which creates (quiesced) snapshots concurrently.
The hostd service might not be able to handle that many tasks at once.

Resolution

This is a known issue.

Currently, there is no resolution.

Workaround:
To work around the issue, limit the number of concurrent snapshot operations to no more than 8:
  1. Back up /etc/vmware/hostd/hbrsvc.xml:
# cp /etc/vmware/hostd/hbrsvc.xml /etc/vmware/hostd/hbrsvc.xml.backup
  2. Add the following setting under <ConfigRoot> in /etc/vmware/hostd/hbrsvc.xml:
<hbrMaxConcurrentSnapshotOperations>8</hbrMaxConcurrentSnapshotOperations>
  3. Save /etc/vmware/hostd/hbrsvc.xml.
  4. Restart the hostd service (for example, /etc/init.d/hostd restart).
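
The backup and edit steps above can be sketched as a small shell function for hosts where the change has to be repeated. This is an illustrative sketch, not a VMware-provided tool: the function name add_snapshot_limit is hypothetical, and it assumes that the opening <ConfigRoot> tag sits on its own line and that grep, cp, and awk are available in the host shell. It does not restart hostd.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESXi): adds the
# hbrMaxConcurrentSnapshotOperations setting under <ConfigRoot>.
# Assumption: the opening <ConfigRoot> tag is on its own line.
add_snapshot_limit() {
    cfg="$1"      # path to hbrsvc.xml
    limit="$2"    # maximum concurrent snapshot operations, e.g. 8

    # Do nothing if the setting is already present, so reruns are harmless.
    if grep -q '<hbrMaxConcurrentSnapshotOperations>' "$cfg"; then
        echo "setting already present in $cfg" >&2
        return 0
    fi

    # Step 1: back up the file before touching it.
    cp "$cfg" "$cfg.backup" || return 1

    # Step 2: print every line, and emit the new setting once,
    # directly after the first line containing <ConfigRoot>.
    awk -v limit="$limit" '
        { print }
        /<ConfigRoot>/ && !done {
            print "  <hbrMaxConcurrentSnapshotOperations>" limit "</hbrMaxConcurrentSnapshotOperations>"
            done = 1
        }
    ' "$cfg.backup" > "$cfg"
}
```

A possible invocation, followed by the restart from the last step, would be: add_snapshot_limit /etc/vmware/hostd/hbrsvc.xml 8 && /etc/init.d/hostd restart. Review the edited file before restarting hostd, since the sketch does not validate the XML.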

NOTE: The appropriate value depends on the environment and its workload. You might need to set the parameter to a value lower than 8.