In some scenarios, this automated process may ignore vLCM cluster-level remediation settings (such as parallel remediation limits) and attempt to place multiple hosts into maintenance mode simultaneously. This can exhaust cluster compute capacity, leading to DRS errors such as: 'failed getting host recommendation from drs to enter maintenance mode from cluster... reason: There are no active hosts in the cluster.'
YYYY-MM-DDTHH:MM:57.062Z info vmware-vum-server[1408796] [Originator@6876 sub=EHP opID=f4c10152-2332-47cf-b081-##########] CheckContext: {entityMoId: "host-####", vapiSession: "###########", env: {"Host part of VMC": false, "vLCM-VMC integration, Pod service enabled": false, }}, HostCheckContext: {spec: {{ com.vmware.esx.health.hosts.check_spec : { evacuation_action : Optional< >, exclude_checks : [ ] , hosts : Optional< >, maintenance_mode_type : Optional< >, memory_reservation : Optional< >, perspective : BEFORE_MODIFICATION, target_spec : Optional< {{ com.vmware.esx.health.hosts.target_spec : { state_changes : [ {{ map-entry : { key : VMware-HBR-Agent, value : UPGRADE, } }} , ] , } }} >, upgrade_actions : Optional< >, vsan_streched_cluster : Optional< domain-xxxxx>, } }} }
YYYY-MM-DDTHH:MM:21.903Z info vmware-vum-server[614236] [Originator@6876 sub=ClusterApplySolutionTask] [Task, 524] Task:com.vmware.vcIntegrity.lifecycle.ClusterApplySolutionTask ID:########-####-###-####-###########. Applying (solution = VMware-HBR-Agent) solution - Start batch (batch count = 1)
YYYY-MM-DDTHH:MM:21.903Z info vmware-vum-server[614236] [Originator@6876 sub=ClusterApplySolutionTask] [Task, 524] Task:com.vmware.vcIntegrity.lifecycle.ClusterApplySolutionTask ID:########-####-###-####-###########. Host Apply Solution - Check Before EnterMaintenanceMode - (cluster=domain-##) - (host=host-##) - (solution=VMware-HBR-Agent)
YYYY-MM-DDTHH:MM:21.903Z info vmware-vum-server[614236] [Originator@6876 sub=ClusterApplySolutionTask] [Task, 524] Task:com.vmware.vcIntegrity.lifecycle.ClusterApplySolutionTask ID:########-####-###-####-###########. CheckBeforeEnterMaintenanceMode (solution = VMware-HBR-Agent) on (host = host-##) - Solution Component Upgrade - enter MaintenanceMode
YYYY-MM-DDTHH:MM:57.020Z In(166) Hostd[2100975]: [Originator@6876 sub=Vimsvc.ha-eventmgr opID=######## sid=######## user=dcui:vsanmgmtd] Event 150847 : Host .com in ha-datacenter has entered maintenance mode
YYYY-MM-DDTHH:MM:28.855Z In(166) Hostd[2100971]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 150960 : The host has exited maintenance mode
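To see which hosts the solution-apply task has asked to enter maintenance mode, the vum-server log can be filtered for the CheckBeforeEnterMaintenanceMode lines shown above. The following is a minimal sketch, not an official diagnostic: the function name hosts_entering_mm is hypothetical, and the pattern assumes the "(host = host-##)" layout seen in the excerpt, which may differ between builds.

```shell
# Reads vum-server log text on stdin and prints, once each, every host MoID
# for which a "Solution Component Upgrade - enter MaintenanceMode" line exists.
# Assumption: the line layout matches the excerpt above, "(host = host-NN)".
hosts_entering_mm() {
    grep 'CheckBeforeEnterMaintenanceMode' \
        | grep 'enter MaintenanceMode' \
        | sed -n 's/.*(host = \(host-[^)]*\)).*/\1/p' \
        | sort -u
}

# Example (the log file path is environment-specific and not given here):
#   hosts_entering_mm < /path/to/vum-server-log
```

If several host IDs appear with timestamps close together, that may indicate hosts being placed into maintenance mode in parallel rather than one at a time.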
vSphere Replication 8.7
vSphere Replication 8.8
vSphere Replication 9.x
Active Mitigation / EMERGENCY STOP
If the automated remediation loop is actively running and causing service disruption:
Immediately stop the HMS service on the Replication Appliance to sever the connection to vCenter: systemctl stop hms
Log into the vSphere Client and manually cancel any pending or stuck Apply Solution, Remediate Cluster, or Enter Maintenance Mode tasks.
Apply the configuration change outlined in Workaround 1.
Start the HMS service: systemctl start hms
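The ordering of the emergency-stop steps above can be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: the run and emergency_stop helpers and the DRY_RUN variable are hypothetical, and "Workaround 1" is taken to mean the hms-configtool command from step 1 of the workaround. Cancelling tasks in the vSphere Client remains a manual step.

```shell
#!/bin/sh
# Sketch of the emergency-stop order. DRY_RUN=1 (the default here) only prints
# each command; set DRY_RUN=0 on the appliance to actually execute them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

emergency_stop() {
    # 1. Stop HMS so the appliance stops driving remediation through vCenter.
    run systemctl stop hms || return 1

    # 2. Manual step: cancel stuck tasks in the vSphere Client.
    echo "Cancel pending Apply Solution / Remediate Cluster / Enter Maintenance Mode tasks in the vSphere Client."
    if [ "${DRY_RUN:-1}" != "1" ]; then
        printf 'Press Enter once the tasks are cancelled: '
        read -r _ignored || return 1
    fi

    # 3. Apply the configuration change (assumed to be workaround step 1).
    run /opt/vmware/hms/bin/hms-configtool -cmd reconfig \
        -property hms-auto-install-hbragent-vib=false || return 1

    # 4. Start HMS again so it comes up with the new setting.
    run systemctl start hms
}

# Example: preview the sequence without changing anything.
#   DRY_RUN=1 emergency_stop
```

Defaulting to a dry run is a deliberate choice here, so that sourcing or testing the script never stops a service by accident.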
Broadcom Engineering is reviewing this behavior and working on an enhancement.
1. SSH to the vSphere Replication appliance and run the following command to disable automatic installation of the HBR agent VIB:
/opt/vmware/hms/bin/hms-configtool -cmd reconfig -property hms-auto-install-hbragent-vib=false
2. Manually install the VIBs on each ESXi host:
For environments using vLCM Image Mode, you must extract the offline bundle (VMware-HBR-Agent-xxx.zip) from the vSphere Replication ISO, import it into the vLCM Update repository, and add it to your Cluster Image as an independent component. You can then remediate the cluster safely using standard vLCM procedures.
For full step-by-step instructions, see: [How to manage vSphere replication solution in cluster image manually]
3. NEW: Restart the Host Management Service (HMS) to commit the changes to memory: systemctl restart hms
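After the manual installation in step 2, one way to confirm that the agent is present on a host is to filter the output of esxcli software vib list. The sketch below is hedged: the helper name hbr_vib_listed is hypothetical, and matching on the substring "hbr" is an assumption about how the VIB is named, so check the actual name in your offline bundle.

```shell
# Reads `esxcli software vib list` output on stdin, prints any line that
# mentions "hbr" (case-insensitive), and returns non-zero if none is found.
# Assumption: the HBR agent VIB name contains the substring "hbr".
hbr_vib_listed() {
    grep -i 'hbr'
}

# Example, run in an SSH session on the ESXi host:
#   esxcli software vib list | hbr_vib_listed || echo "HBR agent VIB not found"
```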