vSphere HA agent for this host has an error: vSphere HA agent cannot be installed or configured

Article ID: 316534

Products

VMware vCenter Server VMware vSphere ESXi

Issue/Introduction

This article explains how to enable vSphere HA on a cluster when the vSphere HA agent cannot be installed or configured on a host.

Symptoms:

  • vSphere HA fails to enable on the cluster
  • vSphere HA election fails
  • ESXi host is unable to exit Maintenance Mode
fdm.log
2022-11-10T08:12:38.945Z info hostd[2804677] [Originator@6876 sub=Solo.Vmomi opID=la6nxy99-273929-auto-5vd6-h5:70062695-b2-01-01-8c-96c4 user=vpxuser:Domain\Domain_User] Result:
--> (vim.fault.PlatformConfigFault) {
--> text = "Starting vmware-fdm:Setting the memory limit for fdm resource pool on this host to 200 MB
--> Setting the memory limit for fdm resource pool on this host to 0 MB
--> failed
-->
--> ",
--> msg = "",
--> }


vpxa.log
2022-11-10T08:12:38.966Z info vpxa[9031411] [Originator@6876 sub=Default opID=la6nxy99-273929-auto-5vd6-h5:70062695-b2-01-01-8c] [VpxLRO] -- ERROR lro-5209363 -- serviceSystem -- vim.host.ServiceSystem.start: vim.fault.PlatformConfigFault:
--> Result:
--> (vim.fault.PlatformConfigFault) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> text = "Starting vmware-fdm:Setting the memory limit for fdm resource pool on this host to 200 MB
--> Setting the memory limit for fdm resource pool on this host to 0 MB
--> failed
-->
--> "
--> msg = "Received SOAP response fault from [<<io_obj p:0x000000138ab675a8, h:24, <TCP '127.0.0.1 : 15822'>, <TCP '127.0.0.1 : 8307'>>, /sdk>]: start
--> An error occurred during host configuration."
--> }
--> Args:
-->
--> Arg id:
--> "vmware-fdm"


syslog.log
2022-12-15T00:44:37.881Z fdm[2112489]: Successfully dlopen()=/lib64/libconfigstorec.so
2022-12-15T00:44:37.881Z fdm[2112489]: Empty hostlist. Nothing to upgrade.
2022-12-15T00:44:37.881Z fdm[2112489]: Upgrade succeessful for hostlist.
2022-12-15T00:44:37.882Z fdm[2112489]: Empty clusterconfig. Nothing to upgrade.
2022-12-15T00:44:37.882Z fdm[2112489]: Upgrade succeessful for clusterconfig.
2022-12-15T00:44:37.882Z fdm[2112489]: Empty vmmetadata. Nothing to upgrade.
2022-12-15T00:44:37.882Z fdm[2112489]: Upgrade succeessful for vmmetadata.
2022-12-15T00:44:37.883Z fdm[2112489]: Uprade FDM configuration failed with error:Duplicate child: unknownStateMonitorPeriod.
2022-12-15T00:44:37.883Z fdm[2112489]: Upgrade failed for fdm.cfg.
2022-12-15T00:44:37.897Z vmware-fdm[2112492]: Starting vmware-fdm service
2022-12-15T00:44:37.916Z vmware-fdm[2112495]: Invoking config_rp to configure fdm resource pool

vpxd.log
2022-11-10T08:12:38.919Z info vpxd[09953] [Originator@6876 sub=MoHost opID=la6nxy99-273929-auto-5vd6-h5:70062695-b2-01-01] VC state for host host-187928 (uninitialized -> init error), FDM state (UNKNOWN_FDM_HSTATE -> UNKNOWN_FDM_HSTATE), src of state (null -> null)
2022-11-10T08:12:38.935Z info vpxd[09953] [Originator@6876 sub=vpxLro opID=la6nxy99-273929-auto-5vd6-h5:70062695-b2-01-01] [VpxLRO] -- FINISH lro-19700064
2022-11-10T08:12:38.935Z info vpxd[09953] [Originator@6876 sub=Default opID=la6nxy99-273929-auto-5vd6-h5:70062695-b2-01-01] [VpxLRO] -- ERROR lro-19700064 -- -- DasConfig.ConfigureHost: vmodl.fault.SystemError:
--> Result:
--> (vmodl.fault.SystemError) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> reason = "Failed to start fdm service on host [vim.HostSystem:host-187928,esxihost.vmware.local]"
--> msg = ""
--> }
--> Args:
-->
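
To check whether a host shows the same signature, the syslog.log messages above can be searched for directly. The following is a minimal sketch rather than an official tool: the function name find_fdm_cfg_errors is hypothetical, and the log path in the usage comment is an assumption about where syslog.log is stored on the host.

```shell
# Print syslog lines that match the fdm.cfg upgrade failure shown above.
# $1: path to a syslog.log file.
# Exit status is 0 if at least one matching line is found, non-zero otherwise.
find_fdm_cfg_errors() {
    grep -E 'Duplicate child|Upgrade failed for fdm\.cfg' "$1"
}

# Example (the path is an assumption; adjust it to where syslog.log is stored):
# find_fdm_cfg_errors /var/log/syslog.log
```

A match on "Duplicate child" also names the offending element, which helps when inspecting fdm.cfg.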



Environment

VMware vCenter Server 7.0.x
VMware vSphere ESXi 7.0.x
VMware vCenter Server 8.0.x
VMware vSphere ESXi 8.0.x

Cause

Enabling vSphere HA fails because the fdm.cfg file contains incorrectly formatted or duplicate entries.
 
<fdm>
<memReservationMB>200</memReservationMB>
<memoryCheckerTimeInSecs>0</memoryCheckerTimeInSecs>
<unknownStateMonitorPeriod>30</unknownStateMonitorPeriod>
<unknownStateMonitorPeriod >30</unknownStateMonitorPeriod>
<unknownStateMonitorPeriod >30</unknownStateMonitorPeriod >
</fdm>
In the above example, the "unknownStateMonitorPeriod" entry appears three times, so two of the lines are duplicates. The duplicate lines also contain extra spaces inside the tags.
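
A quick way to spot this condition is to list element names that open more than one line of the file. The sketch below is line-based and is not an XML parser: it assumes one element per line, as in the example above, and it ignores elements that carry attributes. The function name list_duplicate_tags is hypothetical, and the availability of sed, sort, and uniq in the ESXi shell is an assumption.

```shell
# List element names that open more than one line of an fdm.cfg-style file.
# $1: path to the configuration file.
# Whitespace before the closing ">" (for example "<name >") is ignored, so the
# malformed lines in the example above count as the same element.
list_duplicate_tags() {
    sed -n 's/^[[:space:]]*<\([A-Za-z_][A-Za-z0-9_.-]*\)[[:space:]]*>.*/\1/p' "$1" |
        sort | uniq -d
}

# Example, using the directory named in the workaround:
# list_duplicate_tags /etc/opt/vmware/fdm/fdm.cfg
```

Because the check only compares names line by line, an element name that legitimately appears under two different parents would also be reported, so treat the output as a hint to review the file by hand.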

Resolution

Use one of the workaround options below.

Workaround:
Option 1:
  1. Confirm vSphere HA is disabled on the cluster
  2. SSH to each of the ESXi hosts in the cluster
  3. On each ESXi host, remove the FDM vib:
esxcli software vib remove -n vmware-fdm
  4. Once the FDM vib has been removed on all participating ESXi hosts in the cluster, enable vSphere HA
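
Before enabling vSphere HA again, it can help to confirm that the VIB is really gone from each host. The following is a minimal sketch under stated assumptions: the function name fdm_vib_present is hypothetical, and the check assumes that `esxcli software vib list` prints the VIB name in the first column.

```shell
# Read `esxcli software vib list` output on stdin and succeed only if a
# VIB named vmware-fdm is listed (the name is expected in the first column).
fdm_vib_present() {
    awk '$1 == "vmware-fdm" { found = 1 } END { exit !found }'
}

# Example, run on the ESXi host:
# esxcli software vib list | fdm_vib_present && echo "vmware-fdm is still installed"
```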


Option 2:

  1. Confirm vSphere HA is disabled on the cluster
  2. SSH to each of the ESXi hosts in the cluster
  3. On each ESXi host, modify the fdm.cfg file:
    • cd /etc/opt/vmware/fdm
    • cp .#fdm.cfg fdm.cfg
    • /etc/init.d/vmware-fdm restart 
  4. Once the FDM service has been restarted on all participating ESXi hosts in the cluster, enable vSphere HA
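
The copy in step 3 overwrites the current fdm.cfg. A cautious variant keeps the existing file first so the change can be undone. This is a sketch, not part of the documented procedure: the function name restore_fdm_cfg and the .bak suffix are hypothetical choices.

```shell
# Replace fdm.cfg with the .#fdm.cfg copy, keeping the current file as a backup.
# $1: directory that holds both files (the steps above use /etc/opt/vmware/fdm).
restore_fdm_cfg() {
    dir=$1
    # Stop if the copy that step 3 relies on is missing.
    [ -f "$dir/.#fdm.cfg" ] || { echo "no .#fdm.cfg in $dir" >&2; return 1; }
    # Keep the current file so the change can be undone.
    if [ -f "$dir/fdm.cfg" ]; then
        cp "$dir/fdm.cfg" "$dir/fdm.cfg.bak" || return 1
    fi
    cp "$dir/.#fdm.cfg" "$dir/fdm.cfg"
}

# Example, followed by the restart from step 3:
# restore_fdm_cfg /etc/opt/vmware/fdm && /etc/init.d/vmware-fdm restart
```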