Aria Automation Orchestrator upgrade fails with error 'Cleaning up restore point failed on one or more nodes'

Article ID: 414522

Updated On:

Products

VMware Cloud Director

Issue/Introduction

  • Initiating a patch for Aria Automation Orchestrator using 'vracli upgrade exec -y --repo cdrom://' fails with the error below:

"Cleaning up restore point failed on one or more nodes"

  • The log file "/var/log/vmware/prelude/upgrade-YYYY-MM-DD-HH-MM-SS.log" contains the following errors:

    [ERROR] Restore point directory /data/restorepoint/live-data/live already exists. Aborting to avoid data corruption.
    [ERROR] Saving at least one restore points for tag live-data failed. Cleaning up tag directory.
    [ERROR] Attempt failed to run command: /opt/scripts/upgrade/rstp-save.sh 'local' live-data live-data.
    [ERROR] Remote command failed: /opt/scripts/upgrade/rstp-save.sh 'local' live-data live-data at one or more nodes
    [ERROR] [Exit Code: 1] Saving restore points on all nodes failed.
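
To confirm that a given upgrade log shows this specific failure, the restore point path can be pulled out of the first error line. The following is a minimal sketch, not part of the official procedure; the function name conflicting_restore_points is a hypothetical helper, and it assumes the message wording shown above.

```shell
# Print each restore point directory that the upgrade log reports as
# already existing (hypothetical helper, assumes the message format above).
# $1: path to an upgrade log, for example
#     /var/log/vmware/prelude/upgrade-YYYY-MM-DD-HH-MM-SS.log
conflicting_restore_points() {
  sed -n 's/.*Restore point directory \(.*\) already exists\..*/\1/p' "$1" | sort -u
}
```

If the function prints a path under /data/restorepoint, the log matches the symptom described in this article.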

Environment

VMware Aria Automation 8.18.x

Cause

Files left over from a previous upgrade or patch attempt, such as an existing restore point directory under /data/restorepoint, block the new upgrade or patch.

Resolution

To resolve this issue:

  1. Revert Aria Automation Orchestrator to the snapshot taken before the patch was initiated.

  2. Wait for the environment to come up healthy. Validate by running the following command and confirming that all pods are marked as Running or Completed:

    kubectl get pods -n prelude

  3. Execute the following command on each Automation node to ensure old upgrade and patch data has been cleaned up:

    vracli cluster exec -- bash -c 'rm -rf /data/restorepoint /var/vmware/prelude/upgrade /var/log/vmware/prelude/upgrade-report-latest*; crontab -u root -l | grep -v -F "/opt/scripts/upgrade/upg-mon.sh" | crontab -u root -'

  4. Retry the patch by following the procedure in "Upgrade a Standalone or Clustered Automation Orchestrator 8.x Deployment".
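
The checks in steps 2 and 3 can be scripted before retrying. The following is a minimal sketch under stated assumptions and is not part of the official procedure: the function names unhealthy_pods and leftover_upgrade_paths are hypothetical helpers, and the pod filter assumes the default kubectl column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Read `kubectl get pods -n prelude` output on stdin and print the name and
# status of every pod that is neither Running nor Completed.
# Assumes STATUS is the third column (default kubectl layout).
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Print each of the given paths that still exists on this node.
# No output means the old upgrade and patch data is gone.
leftover_upgrade_paths() {
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf '%s\n' "$p"
    fi
  done
  return 0
}

# Example usage (run on an appliance node):
#   kubectl get pods -n prelude | unhealthy_pods
#   leftover_upgrade_paths /data/restorepoint /var/vmware/prelude/upgrade \
#       /var/log/vmware/prelude/upgrade-report-latest*
```

Empty output from both helpers suggests that the environment is healthy and that the cleanup in step 3 removed the old files on that node.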