Error while collecting data from vCenter cloud account on Aria Automation 8.x

Article ID: 377182

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • Data collection on the vCenter cloud account fails with the error:

"java.util.concurrent.CompletionException: com.vmware.photon.controller.model.adapters.vsphere.VsphereAdapterException: The connection with vAPI endpoint can not be established. Cause: null" 

  • The cloud account endpoint is marked as SUSPENDED after the health check fails a number of consecutive times. After the underlying health check issues are fixed and the cloud account is revalidated, enumeration is still skipped and does not run.
  • Events such as the following appear in the provisioning logs:

provisioning-service-app.log

2024-08-29T07:55:52.740Z WARN provisioning [host='provisioning-service-app-##########-#####' thread='xn-index-queries-4' user='provisioning-###########' org='#########-####-####-####-############' trace='########-####-####-####-############' parent='########-####-####-####-############' span='#########-####-####-####-############'] c.v.xenon.common.ServiceErrorResponse.create:85 - message: Failed to send enum request. Endpoint [/resources/endpoints/#########-####-####-####-############] is in SUSPENDED state., statusCode: 400, serverErrorId: ########-####-####-####-############: Caused by null
provisioning-service-app.log.#:1781:"message": "Failed to send enum request. Endpoint [/resources/endpoints/###########-####-####-####-############] is in SUSPENDED state.", 
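
To see which endpoints the log reports as suspended, the message format shown above can be matched with standard text tools. The following is a minimal sketch, not part of the product; the function name suspended_endpoints is hypothetical, and it assumes the log lines keep the exact wording "Endpoint [...] is in SUSPENDED state" shown in the excerpt.

```shell
# Print the unique endpoint links that the log reports as SUSPENDED.
# Reads the files given as arguments, or stdin when no file is given.
# Assumes the message wording shown in the log excerpt above.
suspended_endpoints() {
    grep -h 'is in SUSPENDED state' "$@" |
        sed -n 's|.*Endpoint \[\(/resources/endpoints/[^]]*\)] is in SUSPENDED state.*|\1|p' |
        sort -u
}
```

For example, running suspended_endpoints provisioning-service-app.log in the directory that holds the log prints one /resources/endpoints/... link per affected endpoint. These links correspond to the document_self_link values used in the Resolution section.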



Environment

VMware Aria Automation 8.x
VCF Automation 9.x

Cause

More than 10 consecutive connection failures to the endpoint, caused by an expired vCenter password, vAPI service issues, or an expired certificate.
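
The behavior described above can be pictured as a counter of consecutive failures that trips a latch. The sketch below is a hypothetical model for illustration only, not Aria Automation code. It assumes that a successful health check resets the counter, that the endpoint is suspended once the counter reaches THRESHOLD (an assumed parameter, defaulting to the 10 failures this article mentions), and that a suspended endpoint stays suspended until it is reset manually.

```shell
# Hypothetical model, not product code: replay health-check results
# ("ok" or "fail", one per argument) and print "<failure count> <state>".
# THRESHOLD is an assumed parameter; this article mentions 10 failures.
simulate_health_checks() {
    threshold=${THRESHOLD:-10}
    count=0
    state=AVAILABLE
    for result in "$@"; do
        # Assumption: once suspended, later results no longer change the state.
        if [ "$state" = SUSPENDED ]; then
            break
        fi
        if [ "$result" = ok ]; then
            count=0                  # a success ends the failure streak
        else
            count=$((count + 1))     # one more consecutive failure
            if [ "$count" -ge "$threshold" ]; then
                state=SUSPENDED
            fi
        fi
    done
    echo "$count $state"
}
```

Under these assumptions, a run of failures interrupted by a success never suspends the endpoint, while an unbroken run does, and a later success does not lift the suspension. That matches the symptom that revalidating the cloud account alone does not resume enumeration.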

Resolution

If assistance with this procedure is desired, please open a support request and reference this KB article.

NOTE: Take a non-memory snapshot of all vRA nodes and confirm that known good backups are available to restore from.

Steps for vRA 8.x:

  • Access the provisioning-db using this command:

# vracli dev psql provisioning-db

  • Check the health state of the endpoint resources with the following query:

select document_self_link, name, custom_properties->>'consecutiveFailuresCount' as consecutiveFailuresCount, custom_properties->>'endpointHealthCheckState' as endpointHealthCheckState from endpoint_state;

  • Find each endpoint that has reached 10 failures or is in the SUSPENDED state, copy its value from the document_self_link column, and use the following commands to reset its failure count to 0 and set its endpointHealthCheckState to AVAILABLE:

UPDATE endpoint_state SET custom_properties = jsonb_set(custom_properties, '{consecutiveFailuresCount}', '0', true) WHERE document_self_link = '<DOCUMENT_SELF_LINK_VALUE>';
UPDATE endpoint_state SET custom_properties = jsonb_set(custom_properties, '{endpointHealthCheckState}', '"AVAILABLE"', true) WHERE document_self_link = '<DOCUMENT_SELF_LINK_VALUE>';

  • Revalidate and save the cloud account, then wait for the data collection cycle to start.
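
Because the same document_self_link value has to be pasted into more than one statement, it can help to generate the statements from a single input. The following is a minimal sketch; the function name reset_endpoint_sql is hypothetical. It prints the two UPDATE statements from the steps above, followed by a SELECT restricted to the same endpoint (an addition for verification, not part of the original procedure).

```shell
# Print the reset statements for one endpoint, plus a verification query.
# $1 is the document_self_link value copied from the endpoint_state query.
reset_endpoint_sql() {
    # Double any single quotes so the value is safe inside a SQL string literal.
    link=$(printf '%s' "$1" | sed "s/'/''/g")
    cat <<EOF
UPDATE endpoint_state SET custom_properties = jsonb_set(custom_properties, '{consecutiveFailuresCount}', '0', true) WHERE document_self_link = '$link';
UPDATE endpoint_state SET custom_properties = jsonb_set(custom_properties, '{endpointHealthCheckState}', '"AVAILABLE"', true) WHERE document_self_link = '$link';
select document_self_link, name, custom_properties->>'consecutiveFailuresCount' as consecutiveFailuresCount, custom_properties->>'endpointHealthCheckState' as endpointHealthCheckState from endpoint_state WHERE document_self_link = '$link';
EOF
}
```

Calling reset_endpoint_sql with the copied link prints three statements that can be reviewed and then pasted into the psql session. After they run, the final query should show a count of 0 and the state AVAILABLE for that endpoint.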

Steps for VCFA 9.0:

  • SSH into the VCFA appliance
  • Find the kubeconfig file (usually at the root). Its name resembles vmsp-6ab8037.kubeconfig.
  • Find the master postgres pod

    kubectl --kubeconfig <vmsp-file.kubeconfig> get pod -n prelude -l spilo-role=master

  • Run the following command to open a shell in the master postgres pod:

    kubectl --kubeconfig <vmsp-file.kubeconfig from step 2> exec -it <master-postgres-pod-name from step 3> -n prelude -- /bin/bash

  • Run the following commands in order to connect to the provisioning database:

    su postgres

    psql

    \c provisioning_db 

  • Run the following command to get a list of endpoints and their state:

    select document_self_link, name, custom_properties->>'consecutiveFailuresCount' as consecutiveFailuresCount, custom_properties->>'endpointHealthCheckState' as endpointHealthCheckState from endpoint_state;

  • From the list above, find each endpoint that has reached 10 failures or is in the SUSPENDED state. Copy the value from its document_self_link column and use the following commands to reset its failure count to 0 and set its endpointHealthCheckState to AVAILABLE:

    UPDATE endpoint_state SET custom_properties = jsonb_set(custom_properties, '{consecutiveFailuresCount}', '0', true) WHERE document_self_link = '<DOCUMENT_SELF_LINK_VALUE>';

    UPDATE endpoint_state SET custom_properties = jsonb_set(custom_properties, '{endpointHealthCheckState}', '"AVAILABLE"', true) WHERE document_self_link = '<DOCUMENT_SELF_LINK_VALUE>';

  • Exit the database by running the following commands:

    \q

    exit
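
The endpoint listing query in the steps above returns every endpoint, so on systems with many cloud accounts it can be easier to filter the result than to read it by eye. The following is a minimal sketch; the function name endpoints_needing_reset is hypothetical, and THRESHOLD is an assumed parameter that defaults to the 10 failures mentioned in the steps. It expects the query result as unaligned, pipe-separated rows in the column order used by the query (document_self_link, name, consecutiveFailuresCount, endpointHealthCheckState), and it assumes awk is available wherever the output is processed.

```shell
# Read pipe-separated rows on stdin and print the document_self_link of
# every endpoint that has reached THRESHOLD failures or is SUSPENDED.
endpoints_needing_reset() {
    awk -F'|' -v threshold="${THRESHOLD:-10}" '
        # The count and state are taken from the last two fields, so a "|"
        # inside the endpoint name does not shift them. An empty count
        # (property not set) is treated as 0 by the "+ 0".
        NF >= 4 && (($(NF-1) + 0) >= threshold || $NF == "SUSPENDED") { print $1 }
    '
}
```

One way to produce suitable input, run as the postgres user in place of the interactive session, is psql -d provisioning_db -At -F '|' -c "<the select query above>", with its output piped into endpoints_needing_reset. The -A, -t and -F options make psql print unaligned rows without headers, using the given field separator.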