Aria Automation Upgrade fails with "Preparation Error"

Article ID: 416907

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • Initiating an Aria Automation upgrade or patch from Aria Suite Lifecycle fails with error "LCMVRAVACONFIG90030" at the stage "upgrading Aria Automation".
  • Checking the status of the upgrade with the command "vracli upgrade status --details" shows: 
    Duration:               1 minutes
    Result:                 Preparation Error
    Description:            Preparation for upgrade has discovered problems. Review to error report below to correct the problems and try again. The services remained in working order.
  • /var/log/vmware/prelude/upgrade-noop.log shows errors similar to the following: 
    [ERROR][<timestamp>][<node>][Exit Code: 255] Attempt failed to run command: /opt/scripts/upgrade/ssh-noop.sh.
    Pseudo-terminal will not be allocated because stdin is not a terminal.
    Welcome to VMware Aria Automation Appliance 8.18.1
    root@1083: Permission denied (publickey,password).
    [ERROR][<timestamp>][<node>] Remote command failed: /opt/scripts/upgrade/ssh-noop.sh at host: <node>
    [ERROR][<timestamp>][<node>] Remote command failed: /opt/scripts/upgrade/ssh-noop.sh at one or more nodes
  • The upgrade still fails after the resolution steps in KB-312221 have been attempted.
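
To see at a glance which nodes the SSH check failed on, the log lines above can be filtered. The following is a minimal sketch, assuming the log entries follow the "Remote command failed: ... at host: <node>" format shown above; `failed_ssh_hosts` is a hypothetical helper name and not part of the appliance tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the hosts named in
# "Remote command failed: ... at host: <node>" lines of an upgrade log.
# The log path is passed as the first argument.
failed_ssh_hosts() {
    local log_file="$1"
    # Print only the text after "at host: ", then de-duplicate it.
    # Lines ending in "at one or more nodes" do not match and are skipped.
    sed -n 's/^.*Remote command failed: .* at host: \(.*\)$/\1/p' "$log_file" | sort -u
}
```

It could be called as `failed_ssh_hosts /var/log/vmware/prelude/upgrade-noop.log` on the node where the upgrade was started.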

Environment

  • Aria Automation 8.x

Cause

  • This issue may be observed in any of the following scenarios:
    • The SSH configuration has incorrect permissions (the Resolution checks /home/root and /home/root/.ssh):
      • The expected permission is '700'. 
    • The SSH configuration files /etc/ssh/sshd_config_effective and/or /etc/ssh/sshd_config_desired are corrupted.
      • The files may be empty.  
    • The SSH configuration references host keys from the wrong key version (for example, v1 keys on a release that expects v2 keys).
      • Steps to identify a version mismatch:
        • The /etc/ssh/sshd_config_effective and/or /etc/ssh/sshd_config_desired files reference v1 SSH keys (the version used by earlier releases of Aria Automation), whereas v2 keys (the version used by 8.16.x and later releases of Aria Automation) are expected: 
          > HostKey /etc/ssh/keys/v1/ssh_host_rsa_key
          > #HostKey /etc/ssh/keys/v1/ssh_host_ecdsa_key
          > HostKey /etc/ssh/keys/v1/ssh_host_ed25519_key
          Note: The SSH keys available on each node are stored under /etc/ssh/keys/v1 and /etc/ssh/keys/v2 and can be reviewed using the command:
           vracli cluster exec -- bash -c "current_node; ls -laR /etc/ssh/"
  • This may be caused by an incomplete earlier upgrade attempt that failed to update the sshd configuration.
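
The scenarios above (empty file, v1 keys where v2 keys are expected) can be checked per file with a small script. This is a sketch only: `sshd_config_state` is a hypothetical helper, and it assumes that active HostKey lines use the /etc/ssh/keys/v1/ or /etc/ssh/keys/v2/ paths shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the state of one sshd configuration file.
# Prints one of: missing, empty, v1, v2, mixed, unknown.
sshd_config_state() {
    local cfg="$1" versions
    [ -e "$cfg" ] || { echo "missing"; return; }
    [ -s "$cfg" ] || { echo "empty"; return; }
    # Collect the key versions referenced by active (uncommented) HostKey lines.
    versions=$(grep -E '^[[:space:]]*HostKey[[:space:]]+/etc/ssh/keys/v[0-9]+/' "$cfg" \
        | sed -E 's#^.*/etc/ssh/keys/(v[0-9]+)/.*$#\1#' | sort -u | tr '\n' ' ')
    case "$versions" in
        "v1 ") echo "v1" ;;
        "v2 ") echo "v2" ;;
        "")    echo "unknown" ;;   # no active HostKey line under /etc/ssh/keys/vN/
        *)     echo "mixed" ;;     # more than one key version referenced
    esac
}
```

For example, `sshd_config_state /etc/ssh/sshd_config_effective` and `sshd_config_state /etc/ssh/sshd_config_desired` could be run on each node; any result other than `v2` on a release that expects v2 keys suggests the file needs to be replaced as described in the Resolution.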

Resolution

  • To resolve this issue, recreate the expected sshd configuration and re-run the upgrade.
  • On each Aria Automation node, perform the steps below to create the ideal configuration and determine which sshd configuration file needs to be replaced. 

    • Identify deviation in configuration:
       
      • Connect to the node using SSH with the root user credentials. 
      • Use the command below to generate a temporary file, /tmp/sshd_ideal, containing the ideal configuration expected on this version of Aria Automation (8.17 and later should use the v2 SSH keys).  
        echo "IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjCiMgVGhlIGJlbG93IGNvbmZpZ3VyYXRpb24gcHJvcGVydHkgaXMgYmVpbmcgYWN0aXZlbHkgdXNlZCBieSB0aGUgU1NIIGtleSBnZW5lcmF0aW9uIHNlcnZpY2UgaW4gY29udGV4dCBvZiB0aGUgUHJlbHVkZSBrZXkgZ2VuZXJhdGlvbiBwb2xpY2llcyBhY2NvcmRpbmcgdG8gU1RJRzoKIyBQcmVsdWRlLkhvc3RLZXlWZXJzaW9uIHYyCiMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIwojCiMgVGhpcyBpcyB0aGUgc3NoZCBzZXJ2ZXIgc3lzdGVtLXdpZGUgY29uZmlndXJhdGlvbiBmaWxlLiAgU2VlCiMgc3NoZF9jb25maWcoNSkgZm9yIG1vcmUgaW5mb3JtYXRpb24uCgojIFRoaXMgc3NoZCB3YXMgY29tcGlsZWQgd2l0aCBQQVRIPS91c3IvYmluOi9iaW46L3Vzci9zYmluOi9zYmluCgojIFRoZSBzdHJhdGVneSB1c2VkIGZvciBvcHRpb25zIGluIHRoZSBkZWZhdWx0IHNzaGRfY29uZmlnIHNoaXBwZWQgd2l0aAojIE9wZW5TU0ggaXMgdG8gc3BlY2lmeSBvcHRpb25zIHdpdGggdGhlaXIgZGVmYXVsdCB2YWx1ZSB3aGVyZQojIHBvc3NpYmxlLCBidXQgbGVhdmUgdGhlbSBjb21tZW50ZWQuICBVbmNvbW1lbnRlZCBvcHRpb25zIG92ZXJyaWRlIHRoZQojIGRlZmF1bHQgdmFsdWUuCgojUG9ydCAyMgojQWRkcmVzc0ZhbWlseSBhbnkKI0xpc3RlbkFkZHJlc3MgMC4wLjAuMAojTGlzdGVuQWRkcmVzcyA6OgoKSG9zdEtleSAvZXRjL3NzaC9rZXlzL3YyL3NzaF9ob3N0X3JzYV9rZXkKSG9zdEtleSAvZXRjL3NzaC9rZXlzL3YyL3NzaF9ob3N0X2VkMjU1MTlfa2V5Ckhvc3RLZXkgL2V0Yy9zc2gva2V5cy92Mi9zc2hfaG9zdF9lY2RzYV9zaGEyNTZfa2V5Ckhvc3RLZXkgL2V0Yy9zc2gva2V5cy92Mi9zc2hfaG9zdF9lY2RzYV9zaGEzODRfa2V5Ckhvc3RLZXkgL2V0Yy9zc2gva2V5cy92Mi9zc2hfaG9zdF9lY2RzYV9zaGE1MTJfa2V5CgojIENpcGhlcnMgYW5kIGtleWluZwojUmVrZXlMaW1pdCBkZWZhdWx0IG5vbmUKCiMgTG9nZ2luZwojU3lzbG9nRmFjaWxpdHkgQVVUSAojTG9nTGV2ZWwgSU5GTwoKIyBBdXRoZW50aWNhdGlvbjoKCiNMb2dpbkdyYWNlVGltZSAybQpQZXJtaXRSb290TG9naW4geWVzClN0cmljdE1vZGVzIHllcwpNYXhBdXRoVHJpZXMgMwpNYXhTZXNzaW9ucyAxCgojUHVia2V5QXV0aGVudGljYXRpb24geWVzCgojIFRoZSBkZWZhdWx0IGlzIHRvIGNoZWNrIGJvdGggLnNzaC9hdXRob3JpemVkX2tleXMgYW5kIC5zc2gvYXV0aG9yaXplZF9rZXlzMgojIGJ1dCB0aGlzIGlzIG92ZXJyaWRkZW4gc28gaW5zdGFsbGF0aW9ucyB3aWxsIG9ubHkgY2hlY2sgLnNzaC9hdXRob3JpemVkX2tleXMKQXV0aG9yaXplZEtleXNGaWxlCS5zc2gvYXV0aG9yaXplZF9rZXlzCgojQXV0aG9yaXplZFByaW5jaXBhbHNGaWxlIG5vbmUKCiNBdXRob3JpemVkS2V5c0NvbW1hbmQgbm9uZQojQXV0aG9yaXplZEtleXNDb21tYW5kVXNlciBub2JvZHkKCiMgRm9yIHRoaXMgdG8gd29yay
B5b3Ugd2lsbCBhbHNvIG5lZWQgaG9zdCBrZXlzIGluIC9ldGMvc3NoL3NzaF9rbm93bl9ob3N0cwojSG9zdGJhc2VkQXV0aGVudGljYXRpb24gbm8KIyBDaGFuZ2UgdG8geWVzIGlmIHlvdSBkb24ndCB0cnVzdCB+Ly5zc2gva25vd25faG9zdHMgZm9yCiMgSG9zdGJhc2VkQXV0aGVudGljYXRpb24KI0lnbm9yZVVzZXJLbm93bkhvc3RzIG5vCiMgRG9uJ3QgcmVhZCB0aGUgdXNlcidzIH4vLnJob3N0cyBhbmQgfi8uc2hvc3RzIGZpbGVzCiNJZ25vcmVSaG9zdHMgeWVzCgojIFRvIGRpc2FibGUgdHVubmVsZWQgY2xlYXIgdGV4dCBwYXNzd29yZHMsIGNoYW5nZSB0byBubyBoZXJlIQpQYXNzd29yZEF1dGhlbnRpY2F0aW9uIHllcwojUGVybWl0RW1wdHlQYXNzd29yZHMgbm8KCiMgQ2hhbmdlIHRvIG5vIHRvIGRpc2FibGUgcy9rZXkgcGFzc3dvcmRzCkNoYWxsZW5nZVJlc3BvbnNlQXV0aGVudGljYXRpb24gbm8KCiMgS2VyYmVyb3Mgb3B0aW9ucwpLZXJiZXJvc0F1dGhlbnRpY2F0aW9uIG5vCiNLZXJiZXJvc09yTG9jYWxQYXNzd2QgeWVzCiNLZXJiZXJvc1RpY2tldENsZWFudXAgeWVzCiNLZXJiZXJvc0dldEFGU1Rva2VuIG5vCgojIEdTU0FQSSBvcHRpb25zCkdTU0FQSUF1dGhlbnRpY2F0aW9uIG5vCiNHU1NBUElDbGVhbnVwQ3JlZGVudGlhbHMgeWVzCgojIFNldCB0aGlzIHRvICd5ZXMnIHRvIGVuYWJsZSBQQU0gYXV0aGVudGljYXRpb24sIGFjY291bnQgcHJvY2Vzc2luZywKIyBhbmQgc2Vzc2lvbiBwcm9jZXNzaW5nLiBJZiB0aGlzIGlzIGVuYWJsZWQsIFBBTSBhdXRoZW50aWNhdGlvbiB3aWxsCiMgYmUgYWxsb3dlZCB0aHJvdWdoIHRoZSBDaGFsbGVuZ2VSZXNwb25zZUF1dGhlbnRpY2F0aW9uIGFuZAojIFBhc3N3b3JkQXV0aGVudGljYXRpb24uICBEZXBlbmRpbmcgb24geW91ciBQQU0gY29uZmlndXJhdGlvbiwKIyBQQU0gYXV0aGVudGljYXRpb24gdmlhIENoYWxsZW5nZVJlc3BvbnNlQXV0aGVudGljYXRpb24gbWF5IGJ5cGFzcwojIHRoZSBzZXR0aW5nIG9mIFBlcm1pdFJvb3RMb2dpbiB3aXRob3V0LXBhc3N3b3JkLgojIElmIHlvdSBqdXN0IHdhbnQgdGhlIFBBTSBhY2NvdW50IGFuZCBzZXNzaW9uIGNoZWNrcyB0byBydW4gd2l0aG91dAojIFBBTSBhdXRoZW50aWNhdGlvbiwgdGhlbiBlbmFibGUgdGhpcyBidXQgc2V0IFBhc3N3b3JkQXV0aGVudGljYXRpb24KIyBhbmQgQ2hhbGxlbmdlUmVzcG9uc2VBdXRoZW50aWNhdGlvbiB0byAnbm8nLgpVc2VQQU0geWVzCgpBbGxvd0FnZW50Rm9yd2FyZGluZyBubwpBbGxvd1RjcEZvcndhcmRpbmcgbm8KR2F0ZXdheVBvcnRzIG5vClgxMUZvcndhcmRpbmcgbm8KI1gxMURpc3BsYXlPZmZzZXQgMTAKI1gxMVVzZUxvY2FsaG9zdCB5ZXMKI1Blcm1pdFRUWSB5ZXMKUHJpbnRNb3RkIHllcwojUHJpbnRMYXN0TG9nIHllcwpUQ1BLZWVwQWxpdmUgbm8KUGVybWl0VXNlckVudmlyb25tZW50IG5vCkNvbXByZXNzaW9uIG5vCiNDbGllbnRBbGl2ZUludGVydmFsIDAKQ2xpZW50QWxpdmVDb3
VudE1heCAyCiNVc2VETlMgbm8KI1BpZEZpbGUgL3Zhci9ydW4vc3NoZC5waWQKI01heFN0YXJ0dXBzIDEwOjMwOjEwMApQZXJtaXRUdW5uZWwgbm8KI0Nocm9vdERpcmVjdG9yeSBub25lCiNWZXJzaW9uQWRkZW5kdW0gbm9uZQoKIyBubyBkZWZhdWx0IGJhbm5lciBwYXRoCkJhbm5lciAvZXRjL2lzc3VlCgojIG92ZXJyaWRlIGRlZmF1bHQgb2Ygbm8gc3Vic3lzdGVtcwpTdWJzeXN0ZW0Jc2Z0cAkvdXNyL2xpYmV4ZWMvc2Z0cC1zZXJ2ZXIKCiMgRXhhbXBsZSBvZiBvdmVycmlkaW5nIHNldHRpbmdzIG9uIGEgcGVyLXVzZXIgYmFzaXMKI01hdGNoIFVzZXIgYW5vbmN2cwojCVgxMUZvcndhcmRpbmcgbm8KIwlBbGxvd1RjcEZvcndhcmRpbmcgbm8KQ2lwaGVycyBhZXMyNTYtZ2NtQG9wZW5zc2guY29tLGFlczEyOC1nY21Ab3BlbnNzaC5jb20sYWVzMjU2LWN0cixhZXMxOTItY3RyLGFlczEyOC1jdHIKTUFDcyBobWFjLXNoYTItNTEyLWV0bUBvcGVuc3NoLmNvbSxobWFjLXNoYTItMjU2LWV0bUBvcGVuc3NoLmNvbSxobWFjLXNoYTItNTEyLGhtYWMtc2hhMi0yNTYKCkFsbG93R3JvdXBzIHdoZWVsCiMJUGVybWl0VFRZIG5vCiMJRm9yY2VDb21tYW5kIGN2cyBzZXJ2ZXIKVXNlUHJpdmlsZWdlU2VwYXJhdGlvbiB5ZXMKUmhvc3RzUlNBQXV0aGVudGljYXRpb24gbm8KCktleEFsZ29yaXRobXMgZGlmZmllLWhlbGxtYW4tZ3JvdXAtZXhjaGFuZ2Utc2hhMjU2LGRpZmZpZS1oZWxsbWFuLWdyb3VwMTYtc2hhNTEyLGRpZmZpZS1oZWxsbWFuLWdyb3VwMTgtc2hhNTEyLGRpZmZpZS1oZWxsbWFuLWdyb3VwMTQtc2hhMjU2Cg==" | base64 -d > /tmp/sshd_ideal
      • Compare the sshd_config_effective and sshd_config_desired files with the ideal configuration to identify any differences:
            diff /tmp/sshd_ideal /etc/ssh/sshd_config_effective
            diff /tmp/sshd_ideal /etc/ssh/sshd_config_desired
      • Alternatively, compare the checksums of these configurations:
        Note: These commands need to be run from only one node, after the ideal configuration has been created on all nodes.
        • Generate md5 checksum for the ideal sshd configuration:
             md5sum /tmp/sshd_ideal
        • Generate md5 checksum for the effective sshd configuration:
             vracli cluster exec -- bash -c "current_node; md5sum /etc/ssh/sshd_config_effective"
        • Generate md5 checksum for the desired sshd configuration:
             vracli cluster exec -- bash -c "current_node; md5sum /etc/ssh/sshd_config_desired"
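
The checksum comparison above can also be scripted for the local node so that the result does not have to be compared by eye. This is a minimal sketch: `compare_md5` is a hypothetical helper name, and it assumes the `md5sum` utility already used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a candidate file against a reference file by MD5 checksum.
# Prints "MATCH <candidate>" and returns 0, or prints "DIFFERS <candidate>" and returns 1.
compare_md5() {
    local reference="$1" candidate="$2" ref_sum cand_sum
    ref_sum=$(md5sum "$reference" | awk '{print $1}')
    # A missing or unreadable candidate yields an empty checksum and counts as a difference.
    cand_sum=$(md5sum "$candidate" 2>/dev/null | awk '{print $1}')
    if [ -n "$cand_sum" ] && [ "$ref_sum" = "$cand_sum" ]; then
        echo "MATCH $candidate"
    else
        echo "DIFFERS $candidate"
        return 1
    fi
}
```

For example, `compare_md5 /tmp/sshd_ideal /etc/ssh/sshd_config_effective` and `compare_md5 /tmp/sshd_ideal /etc/ssh/sshd_config_desired` report whether each file on the current node already matches the ideal configuration.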

    • Remediate the observed deviation:
       
      • If either configuration deviates from the ideal configuration, replace it with the ideal configuration:
        • Back up the existing effective and/or desired sshd configuration (invalid: still using v1 keys, or corrupted)
          • Effective sshd config:
                vracli cluster exec -- bash -c "current_node; cp /etc/ssh/sshd_config_effective /etc/ssh/sshd_config_effective.bak"
          • Desired sshd config:
                vracli cluster exec -- bash -c "current_node; cp /etc/ssh/sshd_config_desired /etc/ssh/sshd_config_desired.bak"
        • Copy the ideal configuration (v2) over the effective and/or desired configuration
          • Effective sshd config:
               vracli cluster exec -- bash -c "current_node; cp -a /tmp/sshd_ideal /etc/ssh/sshd_config_effective"
          • Desired sshd config:
               vracli cluster exec -- bash -c "current_node; cp -a /tmp/sshd_ideal /etc/ssh/sshd_config_desired"
      • Clear ssh user keys
           vracli cluster exec -- bash -c "current_node; rm -rf /home/root/.ssh/*"
      • Reload ssh configuration
           vracli cluster exec -- bash -c "current_node; systemctl daemon-reload; systemctl restart sshd"
      • Verify /home/root and /home/root/.ssh have the expected permissions (0700 / drwx------)
           vracli cluster exec -- bash -c 'current_node; ls -la /home | grep -E "root root.*root"; ls -la /home/root | grep -E "\.ssh"'
        • Remediation, if the output does not match expected result:
             vracli cluster exec -- bash -c 'current_node; chmod 700 /home/root; chmod 700 /home/root/.ssh'
      • Exit the SSH session
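
The permission check and remediation above can be combined so that chmod only runs when the mode actually differs. This is a sketch under assumptions: `ensure_mode` is a hypothetical helper, and it relies on GNU `stat -c '%a'` being available on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure a directory has the expected octal mode (default 700).
# Assumes GNU stat, which prints the octal mode with -c '%a'.
ensure_mode() {
    local dir="$1" expected="${2:-700}" actual
    actual=$(stat -c '%a' "$dir") || return 1
    if [ "$actual" != "$expected" ]; then
        # Report the change, then apply it.
        echo "fixing $dir: $actual -> $expected"
        chmod "$expected" "$dir"
    else
        echo "ok $dir: $actual"
    fi
}
```

On a node this could be run as `ensure_mode /home/root` followed by `ensure_mode /home/root/.ssh`.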

    • Validate the configuration corrections performed:
       
      • Open a new SSH session
      • Verify that the v2 keys are in use; the HostKey entries in the output should reference /etc/ssh/keys/v2:
           sshd -T
      • Clear upgrade runtime dir
           vracli cluster exec -- bash -c 'rm -rf /data/restorepoint /var/vmware/prelude/upgrade /var/log/vmware/prelude/upgrade-report-latest; crontab -u root -l | grep -v -F "/opt/scripts/upgrade/upg-mon.sh" | crontab -u root -'
      • Clear ssh user keys
           vracli cluster exec -- bash -c "current_node; rm -f /home/root/.ssh/*"
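
Because `sshd -T` prints the complete effective configuration, filtering its output makes the host key check above easier to read. This is a minimal sketch: `hostkeys_all_v2` is a hypothetical helper, and it assumes the v2 keys live under /etc/ssh/keys/v2/ as described in the Cause section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `sshd -T` output on stdin and succeed only if every
# "hostkey <path>" entry points into /etc/ssh/keys/v2/.
hostkeys_all_v2() {
    local keys
    # The trailing space excludes related keywords such as hostkeyalgorithms.
    keys=$(grep -i '^hostkey ' || true)
    [ -n "$keys" ] || { echo "no hostkey entries found"; return 1; }
    if printf '%s\n' "$keys" | grep -qv '/etc/ssh/keys/v2/'; then
        echo "non-v2 hostkey entries present"
        return 1
    fi
    echo "all hostkey entries use v2"
}
```

It would be used as `sshd -T | hostkeys_all_v2` on each node.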

      • NOTE: The steps below are purely upgrade prechecks and can be skipped if you do not intend to upgrade:
        • Prepare upgrade runtime dir
             /opt/scripts/upgrade/kube-check-health.sh "nodes, pods" "prep"
        • Prepare SSH channel
             /opt/scripts/upgrade/ssh-config-nodes.sh && echo "Completed successfully" || echo "Failed"
          Expected result: the last line of the output is:
           Verification that nodes are able to connect to one another and to this node succeeded.
        • Clear SSH keys (created by previous steps)
             vracli cluster exec -- bash -c "current_node; rm -rf /home/root/.ssh/*"
        • Clear upgrade state (created by previous steps)
             vracli cluster exec -- bash -c 'rm -rf /data/restorepoint /var/vmware/prelude/upgrade /var/log/vmware/prelude/upgrade-report-latest*; crontab -u root -l | grep -v -F "/opt/scripts/upgrade/upg-mon.sh" | crontab -u root -'
        • The upgrade can now be initiated.
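
For the "Prepare SSH channel" precheck above, the expected last line can be checked by a script instead of by eye. This is a sketch only: `ssh_channel_verified` is a hypothetical helper, and it assumes the success message is printed exactly as quoted above and that the script output is piped in without the trailing `echo` commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read command output on stdin and succeed only if the last
# non-empty line equals the success message quoted in the precheck step.
ssh_channel_verified() {
    local expected="Verification that nodes are able to connect to one another and to this node succeeded."
    local last
    # Drop blank lines, take the final line, and trim surrounding whitespace.
    last=$(grep -v '^[[:space:]]*$' | tail -n 1 | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    [ "$last" = "$expected" ]
}
```

A possible invocation is `/opt/scripts/upgrade/ssh-config-nodes.sh 2>&1 | ssh_channel_verified && echo "SSH channel OK"`; redirecting stderr is an assumption about where the script writes its messages.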