VMware Aria Operations for Logs Upgrade Fails Due to Insufficient Space in /boot Partition

Article ID: 378978

Products

VMware Aria Suite

Issue/Introduction

  • During an upgrade of VMware Aria Operations for Logs from version 8.16.1 to 8.18.0, the upgrade fails because the /boot partition does not have enough free space.
  • The upgrade log reports the amount of free space required (40900 KB, about 41 MB, in this example). The upgrade halts, and corrective action is needed before it can be retried.

 

  • The following errors appear in the /var/log/vmware/loginsight/upgrade.log file:
    • 2024-10-03 04:23:56,286 update-photon-based-boot.sh ERROR There is no enough space in /boot. Required space is 40900 KB.
    • 2024-10-03 04:23:56,286 update-photon-based-boot.sh ERROR Exiting the upgrade process.
    • 2024-10-03 04:23:57,415 initialize-partitions-update.sh INFO Successfully restored boot partition /boot
    • 2024-10-03 04:23:57,418 initialize-partitions-update.sh INFO Cleaning up upgrade files
    • 2024-10-03 04:23:57,781 initialize-partitions-update.sh ERROR Failed
    • 2024-10-03 04:23:57,781 initialize-partitions-update.sh ERROR Exiting the upgrade process.
    • 2024-10-03 04:23:57,784 upgrade-driver INFO Exception occurred!!!
    • 2024-10-03 04:23:57,784 upgrade-driver INFO Cleanining up upgrade version file
    • 2024-10-03 04:23:57,785 upgrade-driver INFO Failed to remove /storage/core/upgrade-version files
    • 2024-10-03 04:23:57,785 upgrade-driver INFO Restarting Log Insight
    • 2024-10-03 04:24:17,592 upgrade-driver INFO Finished running upgrade-driver script.
    • 2024-10-03 04:24:17,609 loginsight-pak-upgrade INFO Exception encountered during key generation, continuing: Strings must be encoded before hashing
    • ERROR: Failed to run partitions update script
    • 2024-10-03 04:24:17,610 loginsight-pak-upgrade INFO Problem occurred when running upgrade: Please check log files for more details.
    • 2024-10-03 04:24:17,615 loginsight-pak-upgrade ERROR Error while running upgrade
    • Traceback (most recent call last):
    • File "/usr/lib/loginsight/application/sbin/loginsight-pak-upgrade.py", line 523, in main raise UpgradeError(err)
    • UpgradeError: 'Problem occurred when running upgrade: Please check log files for more details.'

 

  • Upgrade failure screenshot - UI

 

  • Upgrade failure screenshot - CLI

Environment

  • VMware Aria Operations for Logs 8.16.x

Cause

  • The upgrade fails because kernel files from previous upgrade attempts that were left in a Failed, Pending, Upgrading, or In-progress state are retained in the /boot directory.
  • The upgrade requires a certain amount of free space in the /boot partition. In this case the upgrade log reported that less than the required space (about 41 MB) was available.
  • Multiple older kernel files were present in the /boot partition, occupying the space needed for the upgrade.

Resolution

  • Please refer to the steps in the KB below to verify the state of previous upgrade attempts. Ensure that the previous upgrades are either in a "Completed" or "Failed" state. Any state other than "Failed" or "Completed" may result in kernel files being retained in the /boot directory.

 

  • Snapshots:
    • Take snapshots (without memory) of all Aria Operations for Logs nodes before proceeding.

 

  • Verify Node Status:
    • Run the nodetool-no-pass status command and confirm that every node is in the UN (Up Normal) state. If any node is in the DN state, resolve the underlying issue before proceeding with the upgrade.
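The node status check can be narrowed to the nodes that need attention. The following is a minimal sketch, not part of the product: the helper name non_un_nodes is hypothetical, and it assumes that node lines in the output begin with a two-letter code (U or D followed by N, L, J, or M), as in standard Cassandra nodetool status output.

```shell
# Reads `nodetool-no-pass status` output on stdin and prints every node line
# whose two-letter code is not UN (Up/Normal).
# Assumption: node lines start with [UD][NLJM] followed by whitespace.
non_un_nodes() {
    awk '/^[UD][NLJM][[:space:]]/ && $1 != "UN" { print }'
}

# Usage on an appliance node (not executed here):
#   nodetool-no-pass status | non_un_nodes
```

If the filter prints nothing, every node line it recognised was in the UN state.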

 

  • Identify the Issue in Logs:
    • Review the upgrade log with less /var/log/vmware/loginsight/upgrade.log and look for space-related errors, for example the "There is no enough space in /boot" message shown in the Issue/Introduction section.
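Instead of paging through the log, the required space can be pulled out directly. This is a sketch that assumes the error line has the same wording as the excerpt in this article; the function name required_boot_kb is hypothetical.

```shell
# Prints the required /boot space in KB, taken from the last log line of the form
# "... ERROR There is no enough space in /boot. Required space is 40900 KB."
# $1 is the log path (on the appliance: /var/log/vmware/loginsight/upgrade.log).
required_boot_kb() {
    sed -n 's/.*Required space is \([0-9][0-9]*\) KB.*/\1/p' "$1" | tail -n 1
}

# Usage on an appliance node (not executed here):
#   required_boot_kb /var/log/vmware/loginsight/upgrade.log
```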

 

  • Check Disk Usage:
    • Check the space usage of the /boot partition with the df -h command. In this example, the output showed that /boot was 75% used (87 MB), with only 30 MB available.
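The comparison between available and required space can also be scripted. The sketch below uses hypothetical helper names (avail_kb, has_enough_space) and the figures from this article; the required value should be taken from your own upgrade log.

```shell
# Prints the free space of the filesystem holding $1, in KB.
# -P forces the POSIX one-line-per-filesystem format; column 4 is "Available".
avail_kb() {
    df -Pk "$1" | awk 'NR==2 { print $4 }'
}

# Succeeds (exit 0) when the available KB ($1) covers the required KB ($2).
has_enough_space() {
    [ "$1" -ge "$2" ]
}

# Example with the figures from this article: 30 MB free vs. 40900 KB required.
if has_enough_space $((30 * 1024)) 40900; then
    echo "enough space"
else
    echo "not enough space"
fi
```

On an appliance node, has_enough_space "$(avail_kb /boot)" 40900 would perform the same check against the live partition.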

 

  • Locate Current Kernel Files:
    • Run the uname -r command to check the kernel version currently in use:
      • 5.10.216-2.ph4

 

  • Locate Old Kernel Files:
    • Run cd /boot/ and then du -sh * to list file sizes in the /boot directory. In this example, the large files from an older kernel were:
      • initrd.img-5.10.201-1.ph4 (30 MB)
      • vmlinuz-5.10.201-1.ph4 (11 MB)
    • The files listed above (marked in red in the screenshot) belong to an older kernel. The files whose version matches the uname -r output (highlighted in green in the screenshot) belong to the running kernel and must NOT be moved.
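Telling old kernel files apart from current ones comes down to comparing the version suffix of each file with the uname -r output. The following sketch illustrates this; the function name old_kernel_files is hypothetical, and it only considers the vmlinuz-* and initrd.img-* files named in this article.

```shell
# Lists vmlinuz-* and initrd.img-* files in directory $1 whose version suffix
# differs from the running kernel version $2.
old_kernel_files() {
    dir=$1
    running=$2
    for f in "$dir"/vmlinuz-* "$dir"/initrd.img-*; do
        [ -e "$f" ] || continue            # skip glob patterns that matched nothing
        case "$(basename "$f")" in
            *-"$running") ;;               # belongs to the running kernel: keep it
            *) printf '%s\n' "$f" ;;       # any other version: report it
        esac
    done
}

# Usage on an appliance node (not executed here):
#   old_kernel_files /boot "$(uname -r)"
```

The function only prints candidates; review the list before moving anything.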

 

  • Free Up Space:
    • Because the running kernel version is 5.10.216-2.ph4, move the older kernel files from /boot to /storage/core/ using the following commands:
      • mv initrd.img-5.10.201-1.ph4 /storage/core/
      • mv vmlinuz-5.10.201-1.ph4 /storage/core/
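To reduce the risk of moving a file that belongs to the running kernel, the mv commands can be wrapped in a guard. This is a sketch only; move_if_not_running is a hypothetical helper, not a tool shipped with the product.

```shell
# Moves file $1 into directory $3 unless its name ends with the running
# kernel version $2, in which case it refuses and returns 1.
move_if_not_running() {
    file=$1
    running=$2
    dest=$3
    case "$(basename "$file")" in
        *-"$running")
            echo "refusing to move $file: it belongs to the running kernel" >&2
            return 1
            ;;
    esac
    mv -- "$file" "$dest"/
}

# Usage on an appliance node (not executed here):
#   move_if_not_running /boot/initrd.img-5.10.201-1.ph4 "$(uname -r)" /storage/core
#   move_if_not_running /boot/vmlinuz-5.10.201-1.ph4 "$(uname -r)" /storage/core
```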

 

  • Re-check Disk Usage:
    • Run the df -h command again. In this example, after the older kernel files were moved, /boot usage dropped to 40% (47 MB used), with 70 MB free.

 

  • Re-initiate the Upgrade:
    • Re-initiate the upgrade through the UI. With sufficient free space in /boot, the upgrade should complete successfully.

Additional Information

  • Kernel Version after the upgrade to 8.16.1: 5.10.216-2.ph4
  • Important Logs: /var/log/vmware/loginsight/upgrade.log
  • Commands Used:
    • uname -r to check current kernel version
    • du -sh * to identify large files in /boot
    • df -h to monitor disk space usage
    • mv command to move older kernel files