Symptoms:
VMware Aria Automation 8.x
Either the reboot did not occur after the upgrade, or the services did not start automatically after the reboot, leaving the pods in the kube-system and prelude namespaces not running. The console landing pages nevertheless show that the appliances are upgraded to the correct version.
The Aria Lifecycle upgrade request will time out and fail if the services are not up and running after 4 hours.
Resolution:
1. Take non-memory snapshots of Aria Automation nodes.
2. Reboot the appliances manually.
3. Once the console landing page appears, log in and run /opt/scripts/deploy.sh on one node to start the services.
4. After all pods are running, clean up the upgrade progress by running this command on one node:
vracli cluster exec -- bash -c 'rm -rf /data/restorepoint /var/vmware/prelude/upgrade /var/log/vmware/prelude/upgrade-report-latest*; crontab -u root -l | grep -v -F "/opt/scripts/upgrade/upg-mon.sh" | crontab -u root -'
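As a rough aid for the "all pods are running" check in step 4, the sketch below filters pod listings for anything that is not fully up. It assumes kubectl is available in the root shell of the appliance and that the default `kubectl get pods` column layout (NAME, READY, STATUS, ...) applies. The function name not_ready_pods is hypothetical and is not part of the product tooling.

```shell
#!/usr/bin/env bash
# Sketch only: not_ready_pods is a hypothetical helper, not a product script.
# It reads `kubectl get pods --no-headers` output on stdin and prints the
# NAME, READY and STATUS of every pod that is not fully up.
not_ready_pods() {
  awk '{
    # READY looks like "1/1"; split it into ready and total container counts.
    split($2, ready, "/")
    # Pods of finished jobs report Completed and are not a problem.
    if ($3 == "Completed") next
    # Report pods that are not Running, or Running with containers not ready.
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Assumption: kubectl is on the PATH of the appliance root shell.
if command -v kubectl >/dev/null 2>&1; then
  for ns in kube-system prelude; do
    kubectl get pods -n "$ns" --no-headers | not_ready_pods
  done
fi
```

Empty output suggests that every pod in both namespaces is Running (or Completed) with all containers ready. If any pods are listed, wait and re-run the check before running the cleanup command in step 4.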