Appliance booted into Emergency mode with Error: "Failed to start Switch Root" after attempting to upgrade or apply Hot Fix

Article ID: 379012

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • The Primary node boots into Emergency mode with the error "[FAILED] Failed to start Switch Root." after an attempt to upgrade or apply a Hot Fix
  • Running e2fsck -y against /dev/sda2 and /dev/sda4 reports both file systems as clean
  • An upgrade of Aria Operations for Logs via LCM fails when the Master VM is powered on after the snapshot
  • The Master node appears to be corrupt and fails to boot, even after reverting to the snapshot

 

Environment

Aria Operations for Logs 8.18.x

Cause

A previous attempt to upgrade Aria Operations for Logs failed on the Primary node. A new initrd image was added during the process, but the upgrade never completed. The Primary node cannot boot from the new image because the environment is still on the old 8.18.x platform.

In the example screenshot, Aria Operations for Logs 8.18 uses initrd.img-5.10.216-2.ph4. The failed upgrade to HF1 added the new initrd.img-5.10.219-1.ph4, and GRUB was modified to point to the new image, which causes the node to boot into Emergency mode.
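Once a shell is available on the node, the mismatch described above can be confirmed by listing the kernel versions that have an initrd image in /boot and comparing them with the running kernel. The following is a minimal, read-only sketch; the function name list_initrd_versions is hypothetical, and it assumes the images follow the initrd.img-<version> naming used in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel version of every initrd image
# found in a boot directory (defaults to /boot). Read-only.
# Assumption: images are named initrd.img-<version>, as in this article.
list_initrd_versions() {
    boot_dir="${1:-/boot}"
    for f in "$boot_dir"/initrd.img-*; do
        [ -e "$f" ] || continue            # glob matched nothing
        basename "$f" | sed 's/^initrd\.img-//'
    done
}
```

Running list_initrd_versions /boot next to uname -r should show whether an image newer than the running kernel is present. If more than one version is listed, the extra one is a likely leftover of the failed upgrade, but verify it against the upgrade history before acting on it.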

Resolution

To fix the boot issue, first boot into the original Photon OS entry instead of the latest Photon OS entry. This makes GRUB load the proper initrd image.

Once the node has booted successfully, remove the new image from the /boot folder. This frees up space on the /boot partition, which is configured with very limited space.

  • cd /boot
  • ls -l
  • Remove the newest files. For example (replace the file names with the versions specific to the failed upgrade):
    • rm initrd.img-5.10.219-1.ph4
    • rm vmlinuz-5.10.219-1.ph4
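Before running rm, it can help to double-check which files do not belong to the kernel that is currently running. The following is a dry-run sketch that only prints candidates and deletes nothing. The function name stale_boot_files is hypothetical, and it assumes that the output of uname -r matches the version suffix of the file names in /boot, which should be verified on the node first.

```shell
#!/bin/sh
# Hypothetical helper: print initrd/vmlinuz files in a boot directory that
# do NOT belong to the given kernel release. Dry run only, nothing is removed.
# Assumption: files are named initrd.img-<release> and vmlinuz-<release>.
stale_boot_files() {
    boot_dir="$1"
    keep_release="$2"
    for f in "$boot_dir"/initrd.img-* "$boot_dir"/vmlinuz-*; do
        [ -e "$f" ] || continue            # glob matched nothing
        case "$(basename "$f")" in
            "initrd.img-$keep_release" | "vmlinuz-$keep_release")
                ;;                         # belongs to the kept kernel, skip
            *)
                printf '%s\n' "$f"         # candidate for manual review
                ;;
        esac
    done
}
```

A typical call would be stale_boot_files /boot "$(uname -r)", followed by df -h /boot before and after the cleanup to confirm that space was freed. The output is a list of candidates only. It also includes any older kernel images, so review each file before removing it.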

Once the /boot disk space cleanup is complete, perform the Aria Operations for Logs upgrade manually on the Primary node by following this KB article:

KB344057 - How to Manually Upgrade VMware Aria Operations for Logs via Command Line

Once the upgrade is successful, the Primary node will reboot, and the UI will be accessible.

The subsequent upgrade of all worker nodes is automated. The progress of the upgrade is displayed on the Cluster status page of the Primary node UI.

NOTE: If the environment is managed by LCM, run an inventory sync in LCM for the Aria Operations for Logs instance once the manual upgrade has completed successfully.