VMware vRealize Network Insight upgrade from 6.7.0 to 6.8 or 6.9 gets stuck at step 37 of 38 with Collector VM showing In Progress status

Article ID: 324430

Products

VCF Operations for Networks
VMware vRealize Network Insight 6.x

Issue/Introduction

The upgrade is stuck with the collector VM in the In Progress state. The following symptoms are seen:

  1. The upgrade GUI shows the VMware vRealize Network Insight collector(s) in the In Progress state for more than 6 hours, while the UI shows that the Platform has been upgraded to 6.8.0 or 6.9.0.

     Expanding In Progress does not show any error message in the vRNI upgrade GUI.

  2. A node version mismatch error is shown for the collector(s) in the GUI under Settings > Infrastructure and update page, in the collector VM(s) table.

  3. Running the command df -h on the vRNI collector VM shows the /boot partition at 99% usage (see the /dev/sda1 line in the output below):
ubuntu@vrni-proxy-release:~$ df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                          5.9G     0  5.9G   0% /dev
tmpfs                         1.2G  5.9M  1.2G   1% /run
/dev/mapper/vg-root            30G  6.6G   22G  24% /
tmpfs                         5.9G  8.0K  5.9G   1% /dev/shm
tmpfs                         5.0M     0  5.0M   0% /run/lock
none                          5.9G     0  5.9G   0% /run/shm
/dev/sda1                     184M  172M  3.4M  99% /boot
/dev/mapper/vg-home            15G  9.7G  4.2G  71% /home
/dev/mapper/vg-tmp             15G  995M   13G   8% /tmp
/dev/mapper/vg-var             53G  7.4G   43G  15% /var
/dev/mapper/vg-var+log         25G   16G  7.8G  67% /var/log
/dev/mapper/vg-var+log+audit   15G   55M   14G   1% /var/log/audit
tmpfs                         1.2G     0  1.2G   0% /run/user/999
tmpfs                         1.2G     0  1.2G   0% /run/user/1002

  4. The launcher log /var/log/arkin/launcher/latest.log contains entries similar to the following:

2022-11-08T17:16:35.454Z INFO launcher.upgrade.ServiceBundleDownloader launcher-upgrade-exec-0 verifySignedBundle:288 Successfully verified signed bundle /tmp/upgrade_infra-base-1666364233-baf013bb340622b13b5e57b9d00b3a799387189401287057995.tgz.bundle and created file /tmp/upgrade_infra-base_166636423344403971197194585.tar.gz
2022-11-08T17:16:35.454Z INFO servicemgmt.launcher.Launcher launcher-upgrade-exec-0 stopService:224 Stopping infra-base
2022-11-08T17:16:35.454Z INFO servicemgmt.launcher.Launcher launcher-upgrade-exec-0 stopService:227 No process control for infra-base
2022-11-08T17:16:35.454Z INFO launcher.upgrade.ServiceUpgrader launcher-upgrade-exec-0 upgradeServiceInternal:121 Starting upgrade now for service infra-base
2022-11-08T17:16:35.454Z INFO launcher.upgrade.ServiceUpgrader launcher-upgrade-exec-0 upgradeFromZip:168 Upgrading infra-base from /home/ubuntu/build-target/infra-base
Calling upgrade for infra-base for version 1666364233
infra-base is going to upgrade to 1666364233
Executing: sudo chef-solo -c /home/ubuntu/build-target/infra-base/infra-automation/upgrade_solo.rb -o recipe[kernel::purge]
Info: checked kernel installed list, infra_base will start upgrade process
update-initramfs: Generating /boot/initrd.img-5.4.0-126-generic
xz: (stdout): Write error: No space left on device
E: mkinitramfs failure find 141 cpio 141 xz --check=crc32 1
update-initramfs: failed for /boot/initrd.img-5.4.0-126-generic with 1.
Compression for /boot partition failed for unknown reason.
+ CheckResult 1 'Fail to upgrade infra-base components'
+ RESULT=1
+ ERROR_MSG='Fail to upgrade infra-base components'
+ WAIT=
+ '[' 1 -ne 0 ']'
+ echo Fail to upgrade infra-base components
Fail to upgrade infra-base components
+ sleep
+ exit 1
Failed to perform infra-base upgrade
upgrade unsuccessful


Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
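
The log signature above can be checked for with a short script. The following is a minimal sketch, assuming the launcher log contains the two failure lines quoted in this article; the function name has_boot_space_failure is hypothetical and not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the given launcher log
# contains both failure lines quoted in this article.
has_boot_space_failure() {
    log_file="$1"
    grep -q 'No space left on device' "$log_file" &&
        grep -q 'update-initramfs: failed for /boot/' "$log_file"
}
```

For example, has_boot_space_failure ./latest.log && echo "/boot ran out of space" prints the message only when both lines are present in a readable copy of the log.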

Environment

VMware vRealize Network Insight 6.7
VMware vRealize Network Insight 6.8
VMware vRealize Network Insight 6.9
Aria Operations for Networks 6.10
Aria Operations for Networks 6.11

Cause

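This issue occurs when the /boot partition on the collector is full or nearly full (99% to 100% used), which leaves no room for update-initramfs to write the new initrd image during the infra-base upgrade.
The space is consumed by older or extra kernels left over from previous versions.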
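
To see which kernel versions are taking up space, the kernel images in /boot can be listed. The following is a minimal sketch, assuming the images follow the Ubuntu naming convention vmlinuz-<version>; the function name list_boot_kernels is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the version suffix of every kernel image found
# in a boot directory (default /boot), one per line.
# Assumes images are named vmlinuz-<version>, as on Ubuntu.
list_boot_kernels() {
    boot_dir="${1:-/boot}"
    for image in "$boot_dir"/vmlinuz-*; do
        # Skip the unexpanded pattern when no image matches.
        [ -e "$image" ] || continue
        basename "$image" | sed 's/^vmlinuz-//'
    done
}
```

Comparing the output of list_boot_kernels with uname -r (the running kernel) shows which versions are left over from earlier releases.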

Resolution

To work around this issue, perform the following steps:

  1. Open an SSH session (for example, with PuTTY) to the vRNI collector VM and log in as the user support.

  2. Run the following command to check the size and usage of the /boot filesystem:

      support@vrni-proxy-release:~$ df -h /boot

      Filesystem      Size  Used Avail Use% Mounted on
      /dev/sda1       1.9G  122M  1.7G   7% /boot

  3. Run the following command to purge the extra kernels left over from older versions:

    support@vrni-proxy-release:~$ sudo bash -x /home/ubuntu/build-target/launcher/kernel_purge.sh 2>&1 | tee /home/ubuntu/logs/kernelpurge.log

    Review the last line of the output of the above command. It should look similar to the following:
    ++ sudo df -H --output=size,pcent /boot
    ++ tail -1
    + echo '[Thu Nov 10 02:52:03 UTC 2022] /boot partition is provisioned with  193M 67% used'

  4. Run tail -F on the launcher log on the vRNI collector VM to follow the upgrade progress:
    support@vrni-proxy-release:~$ sudo tail -F /var/log/arkin/launcher/latest.log

  5. After about 30 minutes, the upgrade should show a completed status for the collector, with the GUI showing Update Completed Successfully (Steps 38 of 38 Complete).
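
The free space on /boot after the purge in step 3 can also be confirmed with a small script. The following is a minimal sketch that parses df -P output; the function names and the default threshold of 90 are assumptions made for illustration and are not part of the product.

```shell
#!/bin/sh
# Read `df -P <path>` output on stdin and print the usage percentage
# without the percent sign. With -P (POSIX format) the data is on line 2.
use_pct_from_df() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Hypothetical helper: report whether a mount point is at or above a
# usage threshold (default 90, an arbitrary value chosen for this sketch).
check_mount() {
    mount_point="$1"
    threshold="${2:-90}"
    pct=$(df -P "$mount_point" | use_pct_from_df)
    if [ -z "$pct" ]; then
        echo "ERROR: could not read usage for $mount_point"
        return 2
    fi
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARN: $mount_point is ${pct}% full"
        return 1
    fi
    echo "OK: $mount_point is ${pct}% full"
}
```

Running check_mount /boot on the collector should print an OK line once the old kernels have been removed; a WARN line suggests the purge did not free enough space.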