Requests in Aria Automation reach and then get stuck at the Initialization stage
Aria Automation 8.x
Pods in Aria Automation can hold on to old Kubernetes service IPs that are no longer present in the environment. Request-related activities then hang because resource enumeration cannot complete.
First, validate that you are encountering this issue:
grep -r "Service not found" /services-logs/prelude/provisioning-service-app/
2024-09-05T14:00:27.265Z WARN provisioning [host='provisioning-service-app-xxxxxx' thread='xn-index-queries-16' user='' org='' trace='' parent='' span=''] c.v.xenon.common.ServiceErrorResponse.create:83 - message: Service not found: http://10.244.xxx.xxx:8282/provisioning/resource-enumeration-tasks/xxxxxxxxxxxxxxxx, statusCode: 404, serverError Id: xxxxxxxxxxxxxxxx
kubectl get services --all-namespaces | grep '10.244.xxx.xxx'
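The two validation commands above can be combined so that each IP found in the log is checked against the current service list. The following is a minimal sketch, not a supported tool: the function names extract_not_found_ips and report_stale_ips are hypothetical, and it assumes that an IP which matches no line of the kubectl output is one of the stale addresses described above.

```shell
# Hypothetical helper: read log lines on stdin and print each unique IP
# that appears in a "Service not found: http://<ip>..." message.
extract_not_found_ips() {
  grep -oE 'Service not found: http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
    | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' \
    | sort -u
}

# Hypothetical helper: compare the IPs from the provisioning log against the
# current service list. Assumption: no match means the IP is stale.
report_stale_ips() {
  services=$(kubectl get services --all-namespaces)
  grep -r "Service not found" /services-logs/prelude/provisioning-service-app/ \
    | extract_not_found_ips \
    | while read -r ip; do
        # -w and -F keep 10.244.1.2 from matching inside 10.244.1.20
        if printf '%s\n' "$services" | grep -qwF -- "$ip"; then
          echo "$ip: still assigned to a service"
        else
          echo "$ip: no matching service (possibly stale)"
        fi
      done
}
```

After sourcing these functions on an Aria Automation node, running report_stale_ips prints one line per IP found in the log.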
To resolve the issue, restart the services using the following command. This takes 15-20 minutes in most environments, and Aria Automation will be unavailable during this time.
/opt/scripts/deploy.sh
Verify that the AGE of the kube-system pods is less than 30 days:
1. Log in via SSH to any of the Aria Automation nodes.
2. Run this command:
kubectl get pods -n kube-system
3. If the AGE of any pod is greater than 30 days, restart the kube-system namespace pods with:
kubectl delete pod -n kube-system --all
4. Re-run the following command: /opt/scripts/deploy.sh
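The age check in steps 2 and 3 can be done by filtering the kubectl output instead of reading the AGE column by eye. This is a rough sketch under stated assumptions: the function name pods_older_than is hypothetical, AGE is assumed to be the last column of the default output, and only the year and day parts of AGE values such as 45d, 2d5h or 1y20d are counted (a year is approximated as 365 days).

```shell
# Hypothetical helper: read "kubectl get pods --no-headers" output on stdin
# and print the name and AGE of every pod older than the given number of days.
pods_older_than() {
  awk -v limit="$1" '
    {
      age = $NF   # assumption: AGE is the last column
      days = 0
      # Count only the year and day components; hours and minutes are ignored.
      if (match(age, /[0-9]+y/)) days += substr(age, RSTART, RLENGTH - 1) * 365
      if (match(age, /[0-9]+d/)) days += substr(age, RSTART, RLENGTH - 1)
      if (days > limit) print $1, age
    }'
}
```

A possible invocation on an Aria Automation node is: kubectl get pods -n kube-system --no-headers | pods_older_than 30. If it prints any pods, continue with step 3.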