Aria Automation Deployments Timeout

Article ID: 408805

Products

VCF Automation

Issue/Introduction

When multiple deployment requests are submitted concurrently in Aria Automation, some deployments complete successfully while others consistently time out and fail. The issue is typically observed under load and affects deployments submitted after an initial set has completed. The logs show a request timeout: "create request Failed: Request timed out after 120 minutes. Please configure project request timeout parameter for long running resource requests."

Environment

Aria Automation 8.18.x

CloudBolt (OneFuse)

Cause

The deployments time out because an external system (e.g., CloudBolt via OneFuse) reaches its API rate limit and begins throttling requests. This is evidenced by extensibility task failures in the logs, such as:
"Extensibility triggered task failed. ... Failure: Extensibility error received for topic compute.provision.post, eventId = '<eventID>': [10040] ... failed with the following error: Workflow run [<workflow run ID>] completed with error [Error: REST call failed. Status code: 429, error: {"code":429,"errors":[{"message":"Request was throttled. Expected available in 15 seconds."}],"statusCode":429} (Dynamic Script Module name : requestFromAnyType#31)]"

The HTTP 429 (Too Many Requests) status code and the explicit "Request was throttled" message confirm that the external API is temporarily refusing requests due to excessive volume. While Aria Automation continues to wait for a response, the external system's throttling causes the overall deployment to exceed its configured timeout.
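The throttling response above includes a wait hint ("Expected available in 15 seconds"). As a minimal sketch of how a caller could honor that hint, the Python below parses the hint out of a 429 body shaped like the one in the log and waits before retrying. It is illustrative only and is not the OneFuse or Aria Automation implementation: `send`, `call_with_retry`, and `throttle_wait_seconds` are hypothetical names, and the body format is assumed to match the logged example.

```python
import json
import re
import time

# Matches the wait hint in messages such as
# "Request was throttled. Expected available in 15 seconds."
_WAIT_HINT = re.compile(r"Expected available in (\d+) seconds?")


def throttle_wait_seconds(body: str, default: int = 15) -> int:
    """Return the wait hint from a 429 response body, or `default` if absent."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return default
    if not isinstance(payload, dict):
        return default
    for error in payload.get("errors", []):
        match = _WAIT_HINT.search(str(error.get("message", "")))
        if match:
            return int(match.group(1))
    return default


def call_with_retry(send, max_attempts: int = 5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, body_text).
    The last response is returned even if it is still a 429.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Wait for the period the external API asked for before retrying.
            sleep(throttle_wait_seconds(body))
    return status, body
```

Each throttled attempt adds its wait period to the total run time of the deployment, which is how repeated 429 responses can push a request past the project timeout.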

Resolution

  1. Open the project and go to the Provisioning tab.
  2. Adjust the request timeout value:
    • To Fail Sooner: If rapid feedback on external system bottlenecks is preferred, reduce the timeout value below the default (120 minutes in the error message above). Deployments then fail more quickly when throttling occurs, which makes it faster to identify and troubleshoot the external system's performance.
    • To Allow More Time: If long-running resource requests, especially those susceptible to temporary throttling, need more time to complete, increase the timeout value beyond the default. This gives the external system a larger window to recover from throttling and process the requests successfully.
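To reason about which direction to adjust the timeout, it can help to estimate how much delay throttling adds. The sketch below is a rough back-of-the-envelope model, not a measurement: it assumes each throttled call simply adds its wait period to the run time, and the function names and sample numbers are hypothetical.

```python
def throttle_overhead_minutes(throttled_calls: int, wait_seconds: int) -> float:
    """Total delay added by throttled calls, assuming each one waits `wait_seconds`."""
    return throttled_calls * wait_seconds / 60


def exceeds_timeout(base_minutes: float, throttled_calls: int,
                    wait_seconds: int, timeout_minutes: float) -> bool:
    """True if the unthrottled run time plus throttling delay passes the timeout."""
    total = base_minutes + throttle_overhead_minutes(throttled_calls, wait_seconds)
    return total > timeout_minutes


# Hypothetical example: a 30-minute deployment that hits 400 throttled calls
# of 15 seconds each gains 100 minutes of delay and passes a 120-minute timeout.
print(throttle_overhead_minutes(400, 15))
print(exceeds_timeout(30, 400, 15, 120))
```

If an estimate like this lands well above the configured timeout, raising the timeout alone may not be enough, and the external system's rate limit is the more useful place to investigate.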