"Cannot complete the configuration of the vSphere HA agent on the host. Setting desired image spec for cluster failed" error occurs when configuring vSphere HA on an image-based cluster.

Article ID: 384913

Products

VMware vCenter Server, VMware vSphere ESXi

Issue/Introduction

  • Enabling vSphere HA on the cluster fails with the following errors:

    "A general system error occurred: Installing HA Components failed on the cluster: domain-c####".
    Cannot complete the configuration of the vSphere HA agent on the host. "Applying HA VIBs on the cluster encountered a failure". Failed installing HA component on the host: host-####.

    OR

    "Cannot complete the configuration of the vSphere HA agent on the host. Setting desired image spec for cluster failed."
  • Error in /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log:

    YYYY-MM-DDTHH:MM:SS error vmware-vum-server[#######] [Originator@#### sub=VumVapi::Utils opID=########-###-####-####-###########] [DepotContentManager ###] Failed to get cached component. No record - VCIDB ERROR: Row with primary key (vsphere-fdm, #.#.#-########) not found in table PM_DEPOT_COMPONENTS

Note: #.#.#-######## is a placeholder for the vsphere-fdm version, which differs for each vCenter build number.

  • Error in /var/log/vmware/vmware-updatemgr/vum-server/imageservice.log:

    INFO imageService[##############] [SoftwareSpecMgr ####] Image validation result: {'info': [], 'warnings': [], 'errors': [{'id': 'com.vmware.vcIntegrity.lifecycle.EsxImage.ComponentNotFoundError', 'message': {'id': 'com.vmware.vcIntegrity.lifecycle.EsxImage.ComponentNotFoundError', 'default_message': 'Component vsphere-fdm cannot be found in depot.', 'args': ['vsphere-fdm']}, 'resolution': None, 'time': 'YYYY-MM-DD'}]}  vmware.esximage.Errors.ComponentNotFoundError: ('vsphere-fdm', '#.#.#-########', 'Could not find the component with name = vsphere-fdm, version = #.#.#-######## in the depot.')
  • The ESXi host on which the vSphere HA configuration fails has an older version of the vsphere-fdm agent installed. To confirm this, run the following command on the ESXi host (the build number of the vsphere-fdm VIB should match the vCenter build number):

esxcli software vib list | grep -i fdm

  • Validating an image at the cluster level on the vCenter results in "Image Validation Failed".

  • core.updatemgr-worker.##### files are present in the /var/core directory on the vCenter Server.

  • The fdm VIB is included in the image as both an independent component and a solution-managed component, which conflicts with the HA enablement logic:

    YYYY-MM-DDThh:mm:ss error vmware-vum-server[#######] [Originator@#### sub=com.vmware.vcIntegrity.lifecycle.CreateOfflineDepotTask] [Task, ###] Task:com.vmware.vcIntegrity.lifecycle.CreateOfflineDepotTask ID:##########################. Task Failed. Error: Error:

    -->    com.vmware.vapi.std.errors.already_exists
    --> Messages:
    -->    com.vmware.vcIntegrity.lifecycle.depots.offline.AlreadyExists<Offline depot content already exists with ID '##########################'.>
    -->
    YYYY-MM-DDThh:mm:ss info vmware-vum-server[#######] [Originator@#### sub=com.vmware.vcIntegrity.lifecycle.CreateOfflineDepotTask] [Task, ###] Task:com.vmware.vcIntegrity.lifecycle.CreateOfflineDepotTask ID:##########################. Task State updated to FAILED
    YYYY-MM-DDThh:mm:ss info vmware-vum-server[#######] [Originator@#### sub=com.vmware.vcIntegrity.lifecycle.PreloadOfflineBundlesTask] [PreloadOfflineBundlesTask ###] The content of offline bundle: /storage/updatemgr/patch-store-temp/vsphere-ha-depot.zip already exists in depot. Treat the import as success.
    YYYY-MM-DDThh:mm:ss info vmware-vum-server[#######] [Originator@#### sub=com.vmware.vcIntegrity.lifecycle.PreloadOfflineBundlesTask] [Task ###] Set com.vmware.vcIntegrity.lifecycle.PreloadOfflineBundlesTask (##################################) progress to 50
    YYYY-MM-DDThh:mm:ss verbose vmware-vum-server[#######] [Originator@#### sub=JobDispatcher] [JobDispatcher ###] The number of tasks: 63

YYYY-MM-DDThh:mm:ss info vmware-vum-server[#######] [Originator@#### sub=Telemetry] [TelemetryManager ###] Sending telemetry data: {"@type":"pman_error_report","taskId":"##############################|##########################","entityId":"#####################################|","parentTaskId":"","errorMessageId":"com.vmware.vcIntegrity.lifecycle.depots.offline.AlreadyExists","errorMessage":"Offline depot content already exists with ID '##########################'.","errorTime":"YYYY-MM-DDThh:mm:ss"}

YYYY-MM-DDThh:mm:ss info vmware-vum-server[#######] [Originator@#### sub=ServiceProvider] [EmbeddedPyServiceProvider ####] The software spec string: {
-->     "add_on": {
-->         "name": "#########",
-->         "version": "803.24280767-###"
-->     },
-->     "alternative_images": null,
-->     "base_image": {
-->         "version": "8.0.3-0.73.24784735"
-->     },
-->     "components": {
-->         "vsphere-fdm": "8.0.3-24674346"
-->     },
-->     "hardware_support": null,
-->     "removed_components": null,
-->     "solutions": {
-->         "com.vmware.vsphere-ha": {
-->             "components": [
-->                 {
-->                     "component": "vsphere-fdm"
-->                 }
-->             ],
-->             "version": "8.0.3-24853646"
-->         }
-->     }
--> }
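
The build comparison mentioned in the symptoms can be sketched in shell. This is a minimal illustration, not a VMware tool: `vib_build` and `same_build` are hypothetical helper names, and treating the com.vmware.vsphere-ha solution version as the build that vsphere-fdm should match is an assumption. The two version strings are the ones from the software spec logged above.

```shell
#!/bin/sh
# Sketch: compare the build number embedded in two version strings.
# vib_build and same_build are hypothetical helpers, not VMware commands.

# Print the build number, i.e. everything after the last "-" in "8.0.3-24674346".
vib_build() {
    printf '%s\n' "${1##*-}"
}

# Succeed (exit 0) only when both version strings carry the same build number.
same_build() {
    [ "$(vib_build "$1")" = "$(vib_build "$2")" ]
}

# Values taken from the software spec in the log excerpt.
component_version="8.0.3-24674346"   # vsphere-fdm listed under "components"
solution_version="8.0.3-24853646"    # com.vmware.vsphere-ha solution (assumed reference build)

if same_build "$component_version" "$solution_version"; then
    echo "builds match: $(vib_build "$component_version")"
else
    echo "build mismatch: $(vib_build "$component_version") vs $(vib_build "$solution_version")"
fi
```

On an ESXi host, the version string passed to `vib_build` could come from the output of `esxcli software vib list | grep -i fdm`, assuming the version is the second column of that output.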

 

Environment

  • vCenter 9.x
  • vCenter 8.x
  • vCenter 7.x
  • ESX 9.x
  • ESXi 8.x
  • ESXi 7.x

Cause

After a vCenter update or upgrade, vCenter cannot retrieve the cached vsphere-fdm component (VIB) from the PM_DEPOT_COMPONENTS table that the updatemgr service keeps in the VCDB PostgreSQL database.
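
One way to check whether a vCenter Server shows this cause is to count the two log signatures quoted in the Issue section. The sketch below assumes the log paths given there; `count_signature` is a hypothetical helper name, and a count of 0 only means the signature is absent from the current log file, not that rotated logs are clean.

```shell
#!/bin/sh
# Sketch: count how often a known error signature appears in a log file.
# count_signature is a hypothetical helper, not a VMware tool.

count_signature() {
    # $1 = log file, $2 = fixed string to look for
    if [ -r "$1" ]; then
        # grep -c exits non-zero when nothing matches but still prints 0
        grep -c -F -- "$2" "$1" || true
    else
        # Unreadable or missing file: report zero matches
        echo 0
    fi
}

LOG_DIR=/var/log/vmware/vmware-updatemgr/vum-server

echo "missing depot rows:  $(count_signature "$LOG_DIR/vmware-vum-server.log" 'not found in table PM_DEPOT_COMPONENTS')"
echo "component not found: $(count_signature "$LOG_DIR/imageservice.log" 'ComponentNotFoundError')"
```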

Resolution

This is a known issue. Broadcom engineering is working on a fix for a future release. Until the fix is available, use the following workaround:

  1. Take a snapshot of the vCenter Server VM. If the vCenter Servers are in Enhanced Linked Mode (ELM), take powered-off snapshots of all of them. For details, refer to Snapshot Best practices for vCenter Server Virtual Machines.

  2. SSH to the vCenter server with root credentials.

  3. Enter the command below to enable the shell:

    shell

  4. Stop the update manager service using the command below:

    service-control --stop vmware-updatemgr

  5. Access the postgres database using the command below:

    su updatemgr -s /bin/bash

    psql -U vumuser -d VCDB

  6. View the two tables below before deleting the required entries:

    table pm_software_desired_states;

    table pm_software_compliances;

  7. If all clusters experience the same issue, or if there is only one cluster in the environment, delete all entries from the same two tables:

    DELETE FROM pm_software_compliances;

    DELETE FROM pm_software_desired_states;

    Note:
    • If the environment has more than one cluster and only one cluster is impacted, use the commands below instead to delete the entries for that cluster only:

      DELETE FROM pm_software_compliances where desired_state_id in (select desired_state_id from pm_software_desired_states where entity_id='domain-c####');

      DELETE FROM pm_software_desired_states where entity_id='domain-c####';

    • To find the cluster domain ID, select the cluster in the vCenter UI inventory and note the ID shown in the browser URL. It looks like domain-c####; exclude the colon that follows it and everything after that colon.

  8. Exit the database by entering the command below and pressing Enter:

    \q

  9. Start a shell session as the root user by using the command below:

    su root -s /bin/bash

  10. Start the update manager service, using the command below:

    service-control --start vmware-updatemgr

  11. Recreate the cluster image from the vCenter UI. For more information, refer to Creating and Managing vSphere Lifecycle Manager Clusters.

  12. For environments with NSX-T: Re-register the missing NSX solution directly via CLI on the vCenter appliance to prevent vLCM from removing NSX-T components. Run the following command (replacing <cluster-id> and <version-number> with your actual values):

    dcli com vmware esx settings clusters software solutions set-task --cluster <cluster-id> --solution com.vmware.nsxt --version <version-number> --components '[{"component":"nsx-lcp-bundle"}]'

    Note: For instructions on how to identify the <cluster-id> and NSX <version-number>, refer to Broadcom KB 396675.

  13. Restart the update manager service using the command below:

    service-control --stop vmware-updatemgr && service-control --start vmware-updatemgr

  14. Re-enable vSphere HA. For more information, refer to Disabling and enabling VMware vSphere High Availability (vSphere HA).
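
The per-cluster cleanup in step 7 depends on typing the cluster domain ID correctly in two statements. The sketch below shows one way to reduce that risk: it extracts the ID from a pasted browser URL, validates it, and prints the two DELETE statements from step 7 without running anything. `cluster_id_from` and `cleanup_sql` are hypothetical helper names, and the sample URL is invented for illustration; only the domain-c#### token inside it matters.

```shell
#!/bin/sh
# Sketch: build the per-cluster DELETE statements from step 7 for one cluster.
# Helper names are hypothetical; the script only prints SQL and executes nothing.

# Pull the first "domain-c<digits>" token out of a browser URL or any other string.
cluster_id_from() {
    printf '%s\n' "$1" | grep -oE 'domain-c[0-9]+' | head -n 1
}

# Print the two statements for a cluster ID, refusing anything that is not domain-c<digits>.
cleanup_sql() {
    case "$1" in
        domain-c*) digits="${1#domain-c}" ;;
        *) echo "not a cluster domain ID: $1" >&2; return 1 ;;
    esac
    case "$digits" in
        "" | *[!0-9]*) echo "not a cluster domain ID: $1" >&2; return 1 ;;
    esac
    # Same order as step 7: compliances first, then desired states.
    cat <<EOF
DELETE FROM pm_software_compliances where desired_state_id in (select desired_state_id from pm_software_desired_states where entity_id='$1');
DELETE FROM pm_software_desired_states where entity_id='$1';
EOF
}

# Hypothetical URL copied from the browser while the cluster is selected.
url='https://vcenter.example.com/ui/app/cluster;nav=h/urn:vmomi:ClusterComputeResource:domain-c1234:00000000-0000-0000-0000-000000000000/summary'
cleanup_sql "$(cluster_id_from "$url")"
```

The printed statements can be reviewed and then pasted into the psql session opened in step 5. Rejecting any ID that contains characters other than digits after domain-c also keeps stray quotes or URL fragments out of the SQL.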