ESXi hosts automatically enter and exit maintenance mode post vSphere Replication upgrade

Products

VMware Live Recovery

Issue/Introduction

Symptoms:

vSphere Replication 8.7 or later versions upgraded.
ESXi hosts enters and exits maintenance mode to update hbr-agent VIB
This is a rolling placement of hosts in maintenance mode and thus results in migrating VMs out of a host that enters maintenance mode.

Validation:

In /var/log/vmware/vum-server/vmware-vum-server.log from vCenter VLCM checks for the host configuration before sending instructions to place ESXi host in maintenance mode

2025-02-25T23:06:57.062Z info vmware-vum-server[1408796] [Originator@6876 sub=EHP opID=f4c10152-2332-47cf-b081-##########] CheckContext: {entityMoId: "host-xxxx", vapiSession: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", env: {"Host part of VMC": false, "vLCM-VMC integration, Pod service enabled": false, }}, HostCheckContext: {spec: {{ com.vmware.esx.health.hosts.check_spec : { evacuation_action : Optional< >, exclude_checks : [ ] , hosts : Optional< >, maintenance_mode_type : Optional< >, memory_reservation : Optional< >, perspective : BEFORE_MODIFICATION, target_spec : Optional< {{ com.vmware.esx.health.hosts.target_spec : { state_changes : [ {{ map-entry : { key : VMware-HBR-Agent, value : UPGRADE, } }} , ] , } }} >, upgrade_actions : Optional< >, vsan_streched_cluster : Optional< domain-xxxxx>, } }} }

In /var/run/log/hostd.log host enter and exit maintenance mode events are observed before and after vib install

2025-02-25T23:06:57.020Z In(166) Hostd[2100975]: [Originator@6876 sub=Vimsvc.ha-eventmgr opID=4d789244 sid=520406ef user=dcui:vsanmgmtd] Event 150847 : Host .com in ha-datacenter has entered maintenance mode

2025-02-25T23:08:28.855Z In(166) Hostd[2100971]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 150960 : The host has exited maintenance mode

Environment

vSphere Replication 8.7

vSphere Replication 8.8

vSphere Replication 9.x

Cause

ESXi hosts entering maintenance mode is triggered by VLCM (VMware Life Cycle Manager) which then pushes hbr-agent VIB to these hosts post vSphere Replication appliance upgrade.
This is an expected behavior and as per the design.

Resolution

This issue has been fixed in VMware Live Recovery Appliance 9.0.4 & VMware ESX 9.0.1.0 that is part of VMware Cloud Foundation 9.0

ESX hosts automatically enter and exit maintenance mode

If you use VMware Life Cycle Manager (vLCM), after upgrading vSphere Replication, the ESX hosts automatically enter and exit maintenance mode.

This issue is fixed in the VMware Live Recovery 9.0.4 appliance.

This behavior is being reviewed and being worked on by Broadcom Engineering for an enhancement.

Workaround 1:

Disable automatic VIB installation/update by vSphere Replication:

1. SSH to the vSphere Replication appliance and run one of the commands below depending on the version of replication appliance:

/opt/vmware/hms/bin/hms-configtool -cmd reconfig -property hms-auto-install-hbragent-vib=false

2. Manually install the VIBs on each ESXi host:

How to manage vSphere replication solution in cluster image manually

Workaround 2:

Please schedule vSphere Replication upgrades during non-business hours, as placing hosts in Maintenance Mode will trigger vMotion, potentially impacting workload performance.

Additional Information

This issue is not observed when running on VCF 4.x or VCF 5.x since VMware Update Manager manages the updates and it uses Baseline mode
This issue is not observed even after moving to vLCM, but if you are still using Baseline mode - However, you will receive alerts on the vCenter server and vSAN Skyline health, prompting to move to image mode