vRealize Operations 8.1 Upgrade hangs on step 4 of 9

Products

VMware Aria Suite

Issue/Introduction

Symptoms:

The vRealize Operations cluster uses IPv6.
The in-place upgrade to vRealize Operations 8.1 hangs on step 4 of 9 on one or more analytic nodes.
The executing status of the Master node shows "Wait for Cassandra to start".
The /storage/log/vcops/log/dbupgrade_timestamp.log on the Master node shows entries similar to:

2020-04-02 17:02:11,884 - INFO - main:601 - Cassandra version has changed, upgrading sstables.
2020-04-02 17:02:11,886 - INFO - cluster_sstable_upgrade:165 - Calling CaSA to upgrade the Cassandra cluster
2020-04-02 17:02:11,887 - DEBUG - getSuiteProperty:140 - STORAGE: '/storage'
2020-04-02 17:02:12,001 - DEBUG - runHttpRequest:203 - Code: 200 - Body: b''
2020-04-02 17:02:12,001 - DEBUG - runHttpRequest:220 - Open URL: 'https://localhost/casa/cassandra/cluster/sstable/upgrade' response code: 200
2020-04-02 17:02:12,002 - INFO - main:603 - Wait for ss table upgrade clusterwide.
2020-04-02 17:02:12,003 - INFO - wait_for_ssupgrade_ready:502 - Calling CaSA to get for ss table status
2020-04-02 17:02:12,003 - DEBUG - getSuiteProperty:140 - STORAGE: '/storage'
2020-04-02 17:02:12,043 - DEBUG - runHttpRequest:203 - Code: 200 - Body: b'false'
2020-04-02 17:02:12,044 - DEBUG - runHttpRequest:220 - Open URL: 'https://localhost/casa/cassandra/cluster/sstable/status' response code: 200
2020-04-02 17:02:12,044 - INFO - wait_for_ssupgrade_ready:515 - retrying in 10 seconds
2020-04-02 17:02:12,055 - INFO - wait_for_ssupgrade_ready:502 - Calling CaSA to get for ss table status
2020-04-02 17:02:12,055 - DEBUG - getSuiteProperty:140 - STORAGE: '/storage'
2020-04-02 17:02:12,104 - DEBUG - runHttpRequest:203 - Code: 200 - Body: b'false'
2020-04-02 17:02:12,104 - DEBUG - runHttpRequest:220 - Open URL: 'https://localhost/casa/cassandra/cluster/sstable/status' response code: 200
2020-04-02 17:02:12,104 - INFO - wait_for_ssupgrade_ready:515 - retrying in 10 seconds
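The pattern in the excerpt above (a status response of b'false' followed by a retry message, repeating indefinitely) can be checked for with a short script. The following is a minimal sketch, not a VMware-supplied tool: the function name summarize_sstable_wait and both regular expressions are assumptions derived only from the log lines shown above, and other builds may word these messages differently.

```python
import re

# Retry message logged by wait_for_ssupgrade_ready in the excerpt above.
RETRY = re.compile(r"wait_for_ssupgrade_ready:\d+ - retrying in \d+ seconds")
# CaSA status response whose body is the literal b'false'.
NOT_READY = re.compile(r"runHttpRequest:\d+ - Code: 200 - Body: b'false'")


def summarize_sstable_wait(lines):
    """Count not-ready status responses and retry messages in log lines.

    `lines` is any iterable of strings, for example an open log file.
    """
    not_ready = 0
    retries = 0
    for line in lines:
        if NOT_READY.search(line):
            not_ready += 1
        if RETRY.search(line):
            retries += 1
    return {"not_ready": not_ready, "retries": retries}
```

Passing the open dbupgrade log file to summarize_sstable_wait returns both counts. Counts that keep growing between runs suggest the upgrade is still waiting on the sstable status, which matches the symptom described here.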

Environment

VMware vRealize Operations 8.1.x

Cause

This is caused by a known issue with IPv6 nodes.

Resolution

To resolve the issue, remove the self-generated IPv6 (Scope:Global) entries from all affected Analytic (Data and Replica) nodes, except the Master node.

1. Log in to an affected node as root via SSH or Console.

Note: Press ALT+F1 in a Console to log in.

2. Run the following command to output the configuration of the eth0 interface:

ifconfig eth0

3. In the output, note any entries ending in Scope:Global, except the entry containing the assigned IPv6 address.

Example: In the output below, the assigned IPv6 address is fc00:10:145:152:aaaa:bbbb:cccc:fcc3. The other four Scope:Global entries are the ones to note.
eth0 Link encap:Ethernet HWaddr 00:50:56:88:82:b3
inet6 addr: fc00:10:145:155:250:56ff:fe88:82b3/64 Scope:Global
inet6 addr: fc00:10:145:154:250:56ff:fe88:82b3/64 Scope:Global
inet6 addr: fc00:10:145:153:250:56ff:fe88:82b3/64 Scope:Global
inet6 addr: fc00:10:145:152:aaaa:bbbb:cccc:fcc3/64 Scope:Global
inet6 addr: fc00:10:145:152:250:56ff:fe88:82b3/64 Scope:Global
inet6 addr: fe80::250:56ff:fe88:82b3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:34217193 errors:0 dropped:22891 overruns:0 frame:0
TX packets:2386776 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2368254775 (2.3 GB) TX bytes:528352773 (528.3 MB)

4. Run the following command to delete each invalid entry noted in step 3:

ip addr del address dev eth0

Note: Replace address with the address portion of the entries noted in step 3.

Examples:
ip addr del fc00:10:145:155:250:56ff:fe88:82b3/64 dev eth0
ip addr del fc00:10:145:154:250:56ff:fe88:82b3/64 dev eth0
ip addr del fc00:10:145:153:250:56ff:fe88:82b3/64 dev eth0
ip addr del fc00:10:145:152:250:56ff:fe88:82b3/64 dev eth0

5. Run the following command to re-trigger the sstable upgrade action:

curl -X POST -ik -u "admin:password" https://master_address/casa/cassandra/cluster/sstable/upgrade

Note: Replace password with the admin user's password and replace master_address with the IP address or FQDN of the Master node.

Example: curl -X POST -ik -u "admin:VMware1!" https://vrops-ipv6-master.vcloud.local/casa/cassandra/cluster/sstable/upgrade

Repeat steps 1-5 on all affected nodes.
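On nodes with many addresses, picking out the entries to delete by eye is error-prone. The sketch below shows one way to derive the ip addr del commands from saved ifconfig eth0 output. It is illustrative only: the function name removal_commands and its parameters are hypothetical, it assumes the older ifconfig output format shown in this article ("inet6 addr: ... Scope:Global"), and it only builds command strings without running them, so the result should be reviewed before anything is executed.

```python
import re

# Matches lines such as "inet6 addr: fc00::1/64 Scope:Global" in ifconfig output.
INET6 = re.compile(r"inet6 addr:\s*(\S+)\s+Scope:(\w+)")


def removal_commands(ifconfig_output, assigned_address, device="eth0"):
    """Build 'ip addr del' commands for unwanted Scope:Global entries.

    Every Scope:Global address except `assigned_address` is selected.
    Scope:Link entries are never selected.
    """
    # Compare without the prefix length so "addr" and "addr/64" both work.
    keep = assigned_address.split("/")[0].lower()
    commands = []
    for line in ifconfig_output.splitlines():
        match = INET6.search(line)
        if not match:
            continue
        address, scope = match.groups()
        if scope == "Global" and address.split("/")[0].lower() != keep:
            commands.append(f"ip addr del {address} dev {device}")
    return commands
```

Given the example output in this article and the assigned address fc00:10:145:152:aaaa:bbbb:cccc:fcc3, the function returns the same four commands listed under Examples.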

Once completed, the upgrade process will continue as normal.

The upgrade progress can be monitored in the Admin UI, where the Master node's status should move past "Wait for Cassandra to start".
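The log excerpt in the Symptoms section shows the upgrade polling the sstable status and retrying every 10 seconds while the response body is b'false'. The same wait loop can be sketched as follows for anyone who wants to script the check. This is a sketch under stated assumptions: wait_for_sstable_upgrade and fetch_status are hypothetical names, the caller must supply fetch_status (how to authenticate against the status URL is not covered in this article), and treating a body of 'true' as "ready" is an assumption, since the article only shows the 'false' case.

```python
import time


def wait_for_sstable_upgrade(fetch_status, attempts=30, delay_seconds=10,
                             sleep=time.sleep):
    """Poll until fetch_status() returns 'true', or until attempts run out.

    fetch_status: caller-supplied function (hypothetical) returning the
        response body of the sstable status request as text.
    Returns the 1-based attempt number that succeeded, or None on timeout.
    """
    for attempt in range(1, attempts + 1):
        # ASSUMPTION: a ready cluster answers with the text 'true'.
        if fetch_status().strip().lower() == "true":
            return attempt
        # Do not sleep after the final attempt.
        if attempt < attempts:
            sleep(delay_seconds)
    return None
```

The sleep parameter exists so the loop can be exercised without real delays. In normal use it can be left at its default.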