After running the deploy.sh script, the vco-server-app service fails to start.
Running kubectl get pods -n prelude shows the vco-app pod with a high restart count and a status of CrashLoopBackOff.
Running kubectl describe pod <vco-app-pod-name> -n prelude shows a "Back-off restarting failed container" message in the event logs.
This document provides a collection of troubleshooting steps and workarounds for common issues related to VMware Aria Automation Orchestrator pods failing to start.
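As a quick way to confirm the symptom, the sketch below filters the table printed by kubectl get pods and lists only the vco-app pods that are in CrashLoopBackOff, together with their restart count. The function name summarize_crashlooping_vco_pods is hypothetical, and the sketch assumes the default column order (NAME, READY, STATUS, RESTARTS, AGE) and that the pod names begin with vco-app.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints "<pod-name> <restart-count>" for each vco-app pod in CrashLoopBackOff.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
summarize_crashlooping_vco_pods() {
  awk 'NR > 1 && $1 ~ /^vco-app/ && $3 == "CrashLoopBackOff" { print $1, $4 }'
}

# Example usage on the appliance:
#   kubectl get pods -n prelude | summarize_crashlooping_vco_pods
```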
Before attempting manual workarounds, validate that the issue you are experiencing does not match one of these existing known issues:
If the pods are stuck, you can force a reinitialization using one of the following methods.
kubectl get pods -n prelude
For each vco-app pod instance, run the delete command:
kubectl delete pod -n prelude <vco-app-pod-name>
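When there are several vco-app pods, the delete commands can be generated from the pod listing rather than typed by hand. This is a minimal sketch: the function name print_vco_delete_commands is hypothetical, it assumes pod names begin with vco-app, and it only prints the commands so they can be reviewed before being run.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints one delete command per vco-app pod. Nothing is deleted by this function.
print_vco_delete_commands() {
  awk -v ns="${NAMESPACE:-prelude}" \
    'NR > 1 && $1 ~ /^vco-app/ { print "kubectl delete pod -n " ns " " $1 }'
}

# Example usage on the appliance (review the output, then run the printed lines):
#   kubectl get pods -n prelude | print_vco_delete_commands
```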
Scale the vco-app and UI deployments to zero replicas:
kubectl scale deployment vco-app --replicas=0 -n prelude
kubectl scale deployment orchestration-ui-app --replicas=0 -n prelude
Scale both deployments back to one replica:
kubectl scale deployment vco-app --replicas=1 -n prelude
kubectl scale deployment orchestration-ui-app --replicas=1 -n prelude
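The scale-down and scale-up commands above can be wrapped in one small function so that both deployments are always handled together. This is only a sketch: scale_orchestrator and the KUBECTL and NAMESPACE variables are hypothetical names introduced here. Setting KUBECTL=echo prints the commands instead of running them.

```shell
# Hypothetical wrapper around the kubectl scale commands shown above.
# KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-prelude}"

scale_orchestrator() {
  replicas="$1"
  for deployment in vco-app orchestration-ui-app; do
    $KUBECTL scale deployment "$deployment" --replicas="$replicas" -n "$NAMESPACE" || return 1
  done
}

# Example usage on the appliance:
#   scale_orchestrator 0   # stop the pods
#   scale_orchestrator 1   # start them again
```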
On older versions of Orchestrator, slow startup times can cause the pods to be terminated prematurely. You can increase the health probe timeouts to allow more time for services to initialize.
Note: These values have been improved in version 8.12.x and later. Do not apply these changes to versions 8.12.x or higher.
Use a text editor (for example, vi) to open the deployment configuration file:
vi /opt/charts/vco/templates/deployment.yaml
Locate the livenessProbe and readinessProbe sections for the vco-server-app container.
Adjust initialDelaySeconds, periodSeconds, and failureThreshold to increase the timeout period. For example:
livenessProbe:
  failureThreshold: 20
  httpGet:
    path: /vco/api/health/liveness
    port: 8280
    scheme: HTTP
  initialDelaySeconds: 180
  periodSeconds: 30
  successThreshold: 1
  timeoutSeconds: 10
readinessProbe:
  failureThreshold: 20
  httpGet:
    path: /vco/api/health/readiness
    port: 8280
    scheme: HTTP
  initialDelaySeconds: 180
  periodSeconds: 30
  successThreshold: 1
  timeoutSeconds: 10
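To get a feel for how much startup time a given set of probe values allows, the rough budget can be estimated as initialDelaySeconds plus failureThreshold multiplied by periodSeconds. The sketch below is an approximation that ignores timeoutSeconds and scheduling jitter; the function name probe_budget_seconds is hypothetical.

```shell
# Hypothetical helper: approximate number of seconds a container has before
# repeated probe failures take effect.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
probe_budget_seconds() {
  echo $(( $1 + $2 * $3 ))
}

# With the example values above (180, 30, 20) the budget is about 13 minutes:
#   probe_budget_seconds 180 30 20
```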
Save the file, then run the deploy script to apply the changes:
/opt/scripts/deploy.sh
*.hprof files are filling up the disk on the appliance.
A custom workflow or action may be consuming excessive Java heap memory, causing the Orchestrator service to crash and write its memory content to a .hprof file.
Copy the *.hprof file from the appliance to your local machine.
Open the .hprof file in VisualVM to analyze which threads or workflows are consuming the most memory. This analysis should be performed by the workflow developer to refactor the code for better memory management.
If needed, cancel running workflow executions using the vracli vro commands as described in the official documentation.
vracli vro cancel executions
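Before copying anything off the appliance, it can help to list which heap dumps exist. The sketch below searches a directory tree for *.hprof files. The function name list_hprof_files is hypothetical, and the directory to search is left as an argument because this document does not state where the dumps are written.

```shell
# Hypothetical helper: list every *.hprof file under the given directory.
# The search root is an assumption supplied by the caller.
list_hprof_files() {
  find "$1" -type f -name '*.hprof'
}

# Example usage (replace <search-root> with the directory to inspect):
#   list_hprof_files <search-root>
```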
Note: This procedure is only applicable for standalone VMware Aria Orchestrator instances and is not supported for the embedded Orchestrator in an Aria Automation deployment.
Log in to the Orchestrator appliance command line as root.
Run the following command to create the custom profile:
vracli cluster exec -- bash -c 'base64 -d <<< IyBDcmVhdGUgY3VzdG9tIHByb2ZpbGUgZGlyZWN0b3J5Cm1rZGlyIC1wIC9ldGMvdm13YXJlLXByZWx1ZGUvcHJvZmlsZXMvY3VzdG9tLXByb2ZpbGUvCgojIENyZWF0ZSB0aGUgcmVxdWlyZWQgZGlyZWN0b3J5IHRyZWUgdGhhdCB3aWxsIGJlIHVzZWQgd2hlbiB0aGUgcHJvZmlsZSBpcyBhY3RpdmUKbWtkaXIgLXAgL2V0Yy92bXdhcmUtcHJlbHVkZS9wcm9maWxlcy9jdXN0b20tcHJvZmlsZS9oZWxtL3ByZWx1ZGVfdmNvLwoKIyBDcmVhdGUgImNoZWNrIiBmaWxlIHRoYXQgaXMgYW4gZXhlY3V0YWJsZSBmaWxlIHJ1biBieSBkZXBsb3kgc2NyaXB0LgpjYXQgPDxFT0YgPiAvZXRjL3Ztd2FyZS1wcmVsdWRlL3Byb2ZpbGVzL2N1c3RvbS1wcm9maWxlL2NoZWNrCiMhL2Jpbi9iYXNoCmV4aXQgMApFT0YKY2htb2QgNzU1IC9ldGMvdm13YXJlLXByZWx1ZGUvcHJvZmlsZXMvY3VzdG9tLXByb2ZpbGUvY2hlY2sKCiMgQ29weSB2Uk8gcmVzb3VyY2UgbWV0cmljcyBmaWxlIHRvIHlvdXIgY3VzdG9tIHByb2ZpbGUKY2F0IDw8RU9GID4gL2V0Yy92bXdhcmUtcHJlbHVkZS9wcm9maWxlcy9jdXN0b20tcHJvZmlsZS9oZWxtL3ByZWx1ZGVfdmNvLzkwLXJlc291cmNlcy55YW1sCnBvbHlnbG90UnVubmVyTWVtb3J5TGltaXQ6IDYwMDBNCnBvbHlnbG90UnVubmVyTWVtb3J5UmVxdWVzdDogMTAwME0KcG9seWdsb3RSdW5uZXJNZW1vcnlMaW1pdFZjbzogNTYwME0KCnNlcnZlck1lbW9yeUxpbWl0OiA2RwpzZXJ2ZXJNZW1vcnlSZXF1ZXN0OiA1RwpzZXJ2ZXJKdm1IZWFwTWF4OiA0RwoKY29udHJvbENlbnRlck1lbW9yeUxpbWl0OiAxLjVHCmNvbnRyb2xDZW50ZXJNZW1vcnlSZXF1ZXN0OiA3MDBtCkVPRgpjaG1vZCA2NDQgL2V0Yy92bXdhcmUtcHJlbHVkZS9wcm9maWxlcy9jdXN0b20tcHJvZmlsZS9oZWxtL3ByZWx1ZGVfdmNvLzkwLXJlc291cmNlcy55YW1sCg== | bash'
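Because the command above pipes a base64-encoded script straight into bash, it is reasonable to decode the payload and read it before running it. The sketch below shows the decoding step on its own. The function name decode_payload is hypothetical, and it assumes a base64 binary that accepts the -d option, as the command above does.

```shell
# Hypothetical helper: decode a base64 string and print the result to stdout
# so the embedded script can be reviewed. Nothing is executed.
decode_payload() {
  printf '%s' "$1" | base64 -d
}

# Example usage: paste the base64 string from the command above as the argument
#   decode_payload '<base64-string>' | less
```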
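Edit the resource metrics file in the custom profile:
vi /etc/vmware-prelude/profiles/custom-profile/helm/prelude_vco/90-resources.yaml
Increase the server memory values as required. For example:
serverMemoryLimit: 9G
serverMemoryRequest: 8G
serverJvmHeapMax: 7G
Save the file, then run the deploy script to apply the changes:
/opt/scripts/deploy.sh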
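A simple sanity check on the edited file can catch typos before the deploy script is run. In the example values, the JVM heap is smaller than the memory request, which in turn does not exceed the memory limit. The sketch below checks that ordering. Treating it as a rule is an assumption drawn from those values, not a documented requirement. The function name check_vco_memory_order is hypothetical, and the parsing assumes all three values are whole numbers in G.

```shell
# Hypothetical helper: verify serverJvmHeapMax < serverMemoryRequest <= serverMemoryLimit
# in a 90-resources.yaml style file. Assumes whole-number values with a G suffix.
check_vco_memory_order() {
  awk -F': *' '
    /^serverMemoryLimit:/   { limit = $2 + 0 }
    /^serverMemoryRequest:/ { request = $2 + 0 }
    /^serverJvmHeapMax:/    { heap = $2 + 0 }
    END {
      if (heap > 0 && heap < request && request <= limit) { print "ok"; exit 0 }
      print "check values: heap=" heap "G request=" request "G limit=" limit "G"
      exit 1
    }' "$1"
}

# Example usage on the appliance:
#   check_vco_memory_order /etc/vmware-prelude/profiles/custom-profile/helm/prelude_vco/90-resources.yaml
```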
Impact/Risks:
VMware Aria Automation or Automation Orchestrator fails to properly boot. Workflows will fail to run until this is resolved.