HA fails to enable or install on ESXi hosts when using an HSM

Article ID: 415431

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • Unable to install or update the vCenter Server vSphere High Availability (vSphere HA) agent service on multiple clusters and hosts
  • After vCenter Server is patched, the following alerts may be observed on multiple, if not all, clusters and hosts:
    • vSphere HA host status
    • vSphere HA agent for host <hostname> has an error in <cluster_name> in <datacenter_name> : The vSphere HA agent is not reachable from vCenter server
    • vSphere HA agent for this host has an error: vSphere HA agent cannot be installed or configured
  • The issue affects clusters and hosts that are managed with vSphere Lifecycle Manager (vLCM) images and have a Hardware Support Manager (HSM) configured for firmware updates.
    • A firmware component is configured under "Firmware and Drivers Addon" on the affected cluster(s)
  • The option "Reconfigure for vSphere HA" is greyed out on ESXi hosts
  • When attempting to enable or re-enable HA on a cluster, the HA status is stuck at "retrying"
  • The task "Configuring vSphere HA on the cluster" is observed in the recent tasks pane with the error 'Cannot complete the configuration of the vSphere HA agent on the host "Setting desired image spec for cluster failed"'
  • In the /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-###.log file on the vCenter Server, the following errors are observed:

YYYY-MM-DDTHH:MM:SS error vmware-vum-server[73519] [Originator@6876 sub=com.vmware.vcIntegrity.lifecycle.SetSolutionTask] [Task, 524] Task:com.vmware.vcIntegrity.lifecycle.SetSolutionTask ID:52989081-0caf-09e8-dc7f-34940140e5e7. Task Failed. Error: Error:
-->    com.vmware.vapi.std.errors.error
--> Messages:
-->    com.vmware.vcIntegrity.lifecycle.drafts.VmaCoreError<Failed to Serialize/Deserialize Object.>
-->


YYYY-MM-DDTHH:MM:SS info vmware-vum-server[73519] [Originator@6876 sub=PM.AsyncTask.SetSolutionTask{294}] [vciTaskBase 1496] SerializeToVimFault fault:
--> (vmodl.fault.SystemError) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "com.vmware.vcIntegrity.lifecycle.drafts.VmaCoreError",
-->          arg = <unset>,
-->          message = <unset>
-->       }
-->    ],
-->    reason = "vLCM Task failed, see Error Stack for details."
-->    msg = "{
-->     "data": null,
-->     "error_type": "ERROR",
-->     "messages": [
-->         {
-->             "args": [],
-->             "default_message": "Failed to Serialize/Deserialize Object.",
-->             "id": "com.vmware.vcIntegrity.lifecycle.drafts.VmaCoreError"

  • In the /var/log/vmware/vmware-updatemgr/vum-server/hsm-service.log file on the vCenter Server, the following errors are observed:

HsmService:YYYY-MM-DD HH:MM:SS,214[Dummy-40]hsmService:547 [INFO] Got http response status code: 502
HsmService:YYYY-MM-DD HH:MM:SS,214[Dummy-40]hsmService:558 [ERROR] Error code: 502
HsmService:YYYY-MM-DD HH:MM:SS,214[Dummy-40]hsmService:561 [ERROR] Transient error: None
HsmService:YYYY-MM-DD HH:MM:SS,215[Dummy-40]hsmService:797 [INFO] Hsm service result: {'output': 'null', 'error': {'errorCode': 502, 'command': ('packages', 'get'), 'input': '{"requestContext": null, "optArguments": {"package": "CR-5.4(0.250048)\\ud83d\\udc4d", "version": "5.4(0.250048)"}}', 'args': {'status': 502, 'message': 'Error: Expecting value: line 1 column 1 (char 0) when decoding JSON: <html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n.', 'isTransient': 'null'}}}
HsmService:YYYY-MM-DD HH:MM:SS,310[Dummy-48]hsmService:801 [INFO] Running command from C++: packages get
HsmService:YYYY-MM-DD HH:MM:SS,311[Dummy-48]hsmService:754 [INFO] Running command packages get
HsmService:YYYY-MM-DD HH:MM:SS,311[Dummy-48]hsmService:248 [INFO] Got operation arguments: _OptArgsHolder(task_id=None, host_id=None, package='CR-5.4(0.250048)', version='5.4(0.250048)', release=None)
HsmService:YYYY-MM-DD HH:MM:SS,311[Dummy-48]hsmService:258 [INFO] Initiating executor
HsmService:YYYY-MM-DD HH:MM:SS,311[Dummy-48]hsmService:483 [INFO] Got network location: <IP-address>
HsmService:YYYY-MM-DD HH:MM:SS,311[Dummy-48]hsmService:508 [INFO] Extracted hostname: <IP-address>, port: 443
HsmService:YYYY-MM-DD HH:MM:SS,331[Dummy-48]hsmService:274 [INFO] Successfully initiated executor
HsmService:YYYY-MM-DD HH:MM:SS,331[Dummy-48]hsmService:336 [INFO] Retrieving package: CR-5.4(0.250048) version: 5.4(0.250048)
HsmService:YYYY-MM-DD HH:MM:SS,357[Dummy-48]hsmService:540 [ERROR] Error: Expecting value: line 1 column 1 (char 0) when decoding JSON: <html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
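
To check quickly whether a vCenter Server shows the log signatures listed above, the two log files can be searched for the error strings quoted in this article. The following Python script is a minimal sketch, not a supported tool. The signature names, the function names, and the wildcard used in place of ### in the vmware-vum-server log file name are assumptions made for illustration; the directory and the error strings are taken from the excerpts above.

```python
import glob
import re
import sys
from collections import Counter

# Log directory named in this article.
LOG_DIR = "/var/log/vmware/vmware-updatemgr/vum-server"

# Error strings copied from the log excerpts in this article.
# The dictionary keys are illustrative labels, not VMware identifiers.
SIGNATURES = {
    "hsm_error_code_502": re.compile(r"\[ERROR\] Error code: 502"),
    "hsm_502_bad_gateway": re.compile(r"502 Bad Gateway"),
    "vum_vmacoreerror": re.compile(
        r"VmaCoreError<Failed to Serialize/Deserialize Object\.>"
    ),
}


def count_signatures(lines):
    """Count the lines that match each known error signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_files(path_patterns):
    """Scan every file matching the glob patterns; missing files are skipped."""
    totals = Counter()
    for path_pattern in path_patterns:
        for path in sorted(glob.glob(path_pattern)):
            with open(path, encoding="utf-8", errors="replace") as handle:
                totals.update(count_signatures(handle))
    return totals


if __name__ == "__main__":
    # Paths may be passed as arguments; otherwise the article's paths are used.
    patterns = sys.argv[1:] or [
        LOG_DIR + "/hsm-service.log",
        LOG_DIR + "/vmware-vum-server-*.log",
    ]
    totals = scan_files(patterns)
    for name in SIGNATURES:
        print(f"{name}: {totals[name]}")
```

Non-zero counts for both the HSM 502 signatures and the VmaCoreError signature are consistent with the symptoms described in this article. A count of zero does not rule the issue out, because the logs may have rotated.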

Environment

VMware vCenter Server 8.x

VMware vSphere ESXi 8.x

Cause

The vCenter Server patch interrupted the connection between the HSM and the HSM service on vCenter Server, causing the vSphere HA agent installation to fail on clusters where an HSM is configured for firmware updates.

Resolution

Resynchronize or re-establish the connection between the HSM and vCenter Server.

If assistance with this is required, contact the OEM HSM vendor.
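
Before and after re-establishing the connection, it can be useful to confirm whether the HSM address still answers with HTTP 502. The following Python script is a minimal sketch under stated assumptions: the host is the network location that hsm-service.log reports as "Extracted hostname", the port is 443 as shown in that log, and the request path "/" is only a placeholder, because the actual HSM API paths are vendor-specific and are not documented in this article. The function names are hypothetical.

```python
import http.client
import ssl
import sys


def interpret_status(status):
    """Map an HTTP status code to a short, human-readable hint."""
    if status == 502:
        return "bad gateway: the proxy in front of the HSM cannot reach its backend"
    if 500 <= status <= 599:
        return "server error"
    return "endpoint answered"


def probe(host, port=443, path="/", timeout=5.0, verify_tls=True):
    """Send one HTTPS GET request and return the HTTP status code.

    The path is a placeholder; it is not a documented HSM API path.
    """
    context = ssl.create_default_context()
    if not verify_tls:
        # Only for appliances with self-signed certificates.
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    conn = http.client.HTTPSConnection(host, port, timeout=timeout, context=context)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    # Usage: python3 probe_hsm.py <HSM address from hsm-service.log>
    if len(sys.argv) > 1:
        status = probe(sys.argv[1])
        print(status, interpret_status(status))
```

A 502 response at the placeholder path suggests that the proxy in front of the HSM still cannot reach its backend. Any other response only shows that the address is reachable and does not confirm that the HSM integration with vCenter Server is healthy.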

Additional Information