Aria Operations for Networks upgrade from 6.11.0 to 6.12.1 stuck at step 112 of 113
The following four symptoms are observed:
1. In the upgrade GUI, the upgrade does not progress past step 112 of 113, as shown in the screenshot below:
2. The collector node shows Node Version Mismatch, because the last step in the screenshot above applies to the last collector in the deployment. Refer to the screenshot below.
3. chef.run.log under /var/log/arkin/ on the collector appliance shows the following error entries:
ESC[0;1;31m●ESC[0m grub-common.service - Record successful boot for GRUB
Loaded: loaded (ESC]8;;file://aria-networks-collector/lib/systemd/system/grub-common.service^G/lib/systemd/system/grub-common.serviceESC]8;;^G; enabled; vendor preset: enabled)
Active: ESC[0;1;31mfailedESC[0m (Result: exit-code) since Mon 2024-08-12 21:56:14 UTC; 5ms ago
Process: 6629 ExecStartPre=/bin/sh -c [ -s /boot/grub/grubenv ] || rm -f /boot/grub/grubenv; mkdir -p /boot/grub (code=exited, status=0/SUCCESS)
Process: 6632 ExecStart=/usr/bin/grub-editenv /boot/grub/grubenv unset recordfail ESC[0;1;31m(code=exited, status=1/FAILURE)ESC[0m
Main PID: 6632 (code=exited, status=1/FAILURE)
CPU: 5ms
Aug 12 21:56:14 aria-networks-collector systemd[1]: Starting Record successful boot for GRUB...
Aug 12 21:56:14 aria-networks-collector grub-editenv[6632]: /usr/bin/grub-editenv: error: invalid environment block.
Aug 12 21:56:14 aria-networks-collector systemd[1]: ESC[0;1;39mESC[0;1;31mESC[0;1;39mgrub-common.service: Main process exited, code=exited, status=1/FAILUREESC[0m
Aug 12 21:56:14 aria-networks-collector systemd[1]: ESC[0;1;38;5;185mESC[0;1;39mESC[0;1;38;5;185mgrub-common.service: Failed with result 'exit-code'.ESC[0m
Aug 12 21:56:14 aria-networks-collector systemd[1]: ESC[0;1;31mESC[0;1;39mESC[0;1;31mFailed to start Record successful boot for GRUB.ESC[0m
dpkg: error processing package grub-common (--configure):
installed grub-common package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of grub2-common:
grub2-common depends on grub-common (= 2.04-1ubuntu26.17); however:
Package grub-common is not configured yet.
dpkg: error processing package grub2-common (--configure):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of grub-pc-bin:
grub-pc-bin depends on grub-common (= 2.04-1ubuntu26.17); however:
Package grub-common is not configured yet.
dpkg: error processing package grub-pc-bin (--configure):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of grub-pc:
grub-pc depends on grub-common (= 2.04-1ubuntu26.17); however:
Package grub-common is not configured yet.
grub-pc depends on grub2-common (= 2.04-1ubuntu26.17); however:
Package grub2-common is not configured yet.
grub-pc depends on grub-pc-bin (= 2.04-1ubuntu26.17); however:
Package grub-pc-bin is not configured yet.
dpkg: error processing package grub-pc (--configure):
dependency problems - leaving unconfigured
Errors were encountered while processing:
grub-common
grub2-common
grub-pc-bin
grub-pc
STDERR: E: Sub-process /usr/bin/dpkg returned an error code (1)
---- End output of apt-get install -y gnupg ----
Ran apt-get install -y gnupg returned 100
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
4. The launcher latest.log on the collector node shows the following entries for "Failed to perform infra-base upgrade":
2024-08-12T22:31:06.723Z INFO common.utils.InfraMetricEmmiterUtils emitterReporterExecutorService-0 storingInfraMetricsFromRawFiles:89 Running the InfraMetricEmitterControl
+ STATUS=
+ '[' 0 -ne 0 ']'
+ retries=10
+ '[' 10 -gt 1 ']'
+ check_status
++ systemctl is-active datadog-agent.service
+ STATUS=active
+ '[' active = active ']'
+ return 0
+ log_end_msg 0
+ '[' -z 0 ']'
+ '[' '' ']'
+ '[' 0 -eq 0 ']'
+ echo ' ...done.'
...done.
+ return 0
+ exit 0
+ sleep 3
+ '[' -f /usr/lib/systemd/system/telemetry-service.service ']'
+ sudo rm -f /home/ubuntu/build-target/infra-base/telemetry/telemetry-service-sysd-health.sh
+ sudo rm -f /home/ubuntu/build-target/infra-base/telemetry/telemetry-service-sysd-poststop.sh
++ date +%s
+ end=1723501876
+ runtime=80
+ echo 'Execution Time: Datadog configuration 80 secs'
Execution Time: Datadog configuration 80 secs
++ date +%s
+ start=1723501876
+ CONFIG_FILE=/home/ubuntu/build-target/infra-base/ha_cluster/cluster-config.json
+ isThisPlatformCluster
+ isThisPlatform
++ cat /home/ubuntu/build-target/deployment/sku.info
+ CURRENT_SKU=proxy
+ '[' proxy == platform -o proxy == platform-ext-cluster ']'
+ return 1
+ return 1
+ echo 'Skipping cluster unique key distribution as this is not a cluster'
Skipping cluster unique key distribution as this is not a cluster
++ date +%s
+ end=1723501876
+ runtime=0
+ echo 'Execution Time: Cluster unique key 0 secs'
Execution Time: Cluster unique key 0 secs
+ isSaas
+ '[' onprem == onsaas ']'
+ return 1
+ '[' 0 -eq 1 ']'
+ isThisPlatform
++ cat /home/ubuntu/build-target/deployment/sku.info
+ CURRENT_SKU=proxy
+ '[' proxy == platform -o proxy == platform-ext-cluster ']'
+ return 1
+ isThisProxy
++ cat /home/ubuntu/build-target/deployment/sku.info
+ CURRENT_SKU=proxy
+ '[' proxy == proxy ']'
+ return 0
+ sudo rm -rf /home/ubuntu/build-target/infra-base/infra-automation/data_bags/NIAAS-VRNI-NEW-DEV /home/ubuntu/build-target/infra-base/infra-automation/data_bags/NIAAS-VRNI-PRODUCTION /home/ubuntu/build-target/infra-base/infra-automation/data_bags/NIAAS-VRNI-STAGING
+ sudo rm -rf /home/ubuntu/build-target/infra-base-archive/infra-automation/data_bags/NIAAS-VRNI-NEW-DEV /home/ubuntu/build-target/infra-base-archive/infra-automation/data_bags/NIAAS-VRNI-PRODUCTION /home/ubuntu/build-target/infra-base-archive/infra-automation/data_bags/NIAAS-VRNI-STAGING
+ isSaas
+ '[' onprem == onsaas ']'
+ return 1
+ echo 'Cleaning up infra-base'
Cleaning up infra-base
+ sudo rm -vf /home/ubuntu/build-target/infra-base/repo/chef_14.15.6-1_amd64.deb /home/ubuntu/build-target/infra-base/repo/ni-platform-tools-1.0.1707992545.deb /home/ubuntu/build-target/infra-base/repo/vrni-apt-cache_6.12.0_all.deb
removed '/home/ubuntu/build-target/infra-base/repo/chef_14.15.6-1_amd64.deb'
removed '/home/ubuntu/build-target/infra-base/repo/ni-platform-tools-1.0.1707992545.deb'
removed '/home/ubuntu/build-target/infra-base/repo/vrni-apt-cache_6.12.0_all.deb'
+ sudo rm -rvf /home/ubuntu/build-target/infra-base/infra-automation/nodes
+ sudo rm -rvf /home/ubuntu/build-target/infra-base/infra-automation/local-mode-cache
+ echo 'Cleaning up '
Cleaning up
+ ARCHIVE_DIR=/home/ubuntu/build-target/infra-base-archive
+ sudo rm -vf /home/ubuntu/build-target/infra-base-archive/repo/chef_14.15.6-1_amd64.deb /home/ubuntu/build-target/infra-base-archive/repo/ni-platform-tools-1.0.1707992545.deb /home/ubuntu/build-target/infra-base-archive/repo/vrni-apt-cache_6.12.0_all.deb
removed '/home/ubuntu/build-target/infra-base-archive/repo/chef_14.15.6-1_amd64.deb'
removed '/home/ubuntu/build-target/infra-base-archive/repo/ni-platform-tools-1.0.1707992545.deb'
removed '/home/ubuntu/build-target/infra-base-archive/repo/vrni-apt-cache_6.12.0_all.deb'
+ sudo rm -rvf /home/ubuntu/build-target/infra-base-archive/infra-automation/nodes
+ sudo rm -rvf /home/ubuntu/build-target/infra-base-archive/infra-automation/local-mode-cache
+ echo 'upgrade_singlenode.sh completed'
upgrade_singlenode.sh completed
+ exit 0
+ CheckResult 0 'Failed to perform infra-base upgrade'
+ RESULT=0
+ ERROR_MSG='Failed to perform infra-base upgrade'
+ WAIT=
+ '[' 0 -ne 0 ']'
+ echo 6.11.0.1692527086
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
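To check a collector quickly for the chef log signature described in symptom 3, a small helper along these lines may be useful. This is a minimal sketch: the function name `has_grub_failure` is hypothetical, and the two search strings are copied from the example log excerpt above, so they may need adjusting for your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the product): report whether a log file
# contains the GRUB failure signature shown in the chef log excerpt.
has_grub_failure() {
    local logfile="$1"
    # Either string indicates that grub-common failed to configure.
    grep -Eq 'grub-editenv: error: invalid environment block|dpkg: error processing package grub-common' "$logfile"
}

# Example usage (pass the path of the chef log from symptom 3):
#   has_grub_failure /path/to/chef-log && echo "GRUB signature found"
```

The function only reads the file and returns 0 when a match is found, so it is safe to run while the upgrade is still stuck.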
VMware vRealize Network Insight 6.9.0
Aria Operations for Networks 6.10.0
Aria Operations for Networks 6.11.0
Aria Operations for Networks 6.12.0
Aria Operations for Networks 6.12.1
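The GRUB boot loader packages (grub-common, grub2-common, grub-pc-bin, and grub-pc) fail to configure because grub-editenv reports an invalid environment block for /boot/grub/grubenv. As a result, the upgrade gets stuck at the last step.
This happens on deployments where the GRUB environment is corrupted.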
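The `invalid environment block` error in the chef log is what `grub-editenv` reports when it cannot parse `/boot/grub/grubenv`. As a rough, read-only check, the sketch below tests whether a file begins with the `# GRUB Environment Block` header that GRUB writes at the start of a valid environment block. The function name `grubenv_has_signature` is hypothetical, and a passing check does not prove that the rest of the file is intact.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file starts with the GRUB
# environment block header. Nothing is modified.
grubenv_has_signature() {
    local file="$1"
    local sig='# GRUB Environment Block'
    # Compare only the first bytes of the file against the expected header.
    [ "$(head -c "${#sig}" "$file" 2>/dev/null)" = "$sig" ]
}

# Example usage on the collector:
#   grubenv_has_signature /boot/grub/grubenv || echo "grubenv header missing or file unreadable"
```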
To resolve this issue, perform the following steps:
Open a PuTTY/SSH session to the collector node and log in with the username support
Switch to the ubuntu user with the command ub
Run the following commands:
sudo grub-editenv /boot/grub/grubenv create
sudo apt-mark unhold apt-cacher-ng
Navigate to /var/log/arkin/ and tail chef-run.log
Navigate to /var/log/arkin/launcher/ and tail latest.log
The upgrade retry should appear in the launcher latest.log as you tail it.
The upgrade should complete in approximately 30 to 40 minutes.
The Aria Operations for Networks GUI then shows the upgrade as complete (see screenshots).
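After the retry finishes, it can be worth confirming that the four GRUB packages named in the chef log are now fully configured. The sketch below assumes `dpkg-query` is available on the collector (it ships with dpkg). The function name `unconfigured_packages` is hypothetical. It reads "package status" lines on standard input and prints any package whose status is not `install ok installed`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of packages that dpkg does not
# report as fully installed and configured.
unconfigured_packages() {
    # Input lines look like: "grub-common install ok installed"
    while read -r pkg status; do
        [ -n "$pkg" ] || continue
        [ "$status" = "install ok installed" ] || printf '%s\n' "$pkg"
    done
}

# Example usage on the collector:
#   dpkg-query -W -f='${Package} ${Status}\n' grub-common grub2-common grub-pc-bin grub-pc | unconfigured_packages
```

Empty output suggests that all listed packages were configured. Any name printed is a package to look for again in the chef log.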