How to identify the correct upgrading node during RabbitMQ Tile upgrade 1.12 to 1.13

Article ID: 293204

Updated On:

Products

VMware RabbitMQ

Issue/Introduction

Symptoms:
As part of the upgrade process, during the step "Use the BOSH CLI to SSH into the only running node and verify the contents of the file nodes_running_at_shutdown", the file lists two or more nodes.

An example of such a situation is shown below:
cat /var/vcap/store/rabbitmq/mnesia/db/nodes_running_at_shutdown
[rabbit@2d6a7c96149d5cb12be2c06bf9b19042,rabbit@9d5c276e40dbedcf6dda47203cd3584f].
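The check can also be scripted. The following is a minimal sketch, assuming the file holds a single Erlang list of node names in the form shown above; the function name parse_running_nodes is hypothetical and not part of any RabbitMQ or BOSH tooling.

```python
import re
from typing import List

# Contents of nodes_running_at_shutdown, copied from the example above.
EXAMPLE = "[rabbit@2d6a7c96149d5cb12be2c06bf9b19042,rabbit@9d5c276e40dbedcf6dda47203cd3584f]."


def parse_running_nodes(text: str) -> List[str]:
    """Return the node names found in an Erlang list term such as '[rabbit@a,rabbit@b].'.

    Hypothetical helper: it assumes every node name has the form name@host.
    """
    return re.findall(r"[\w.-]+@[\w.-]+", text)


nodes = parse_running_nodes(EXAMPLE)
# More than one entry means this is the situation described in this article.
print(len(nodes), nodes)
```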

Environment


Cause

This occurs when the nodes were started in a different order from the order in which they were shut down.

In this situation, the upgrade fails if you leave this node running, because the node expects two nodes to be running when only one is.

Resolution

To prevent the upgrade from failing, identify the RabbitMQ server whose nodes_running_at_shutdown file lists only one node.

Following the example above, examine the file erl_inetrc from any RabbitMQ server node. For example:
cat /var/vcap/store/rabbitmq/erl_inetrc
{host, {10,193,80,42}, ["2d6a7c96149d5cb12be2c06bf9b19042"]}.
{host, {10,193,80,43}, ["d15a691d1410cc2fbc7689c16db380be"]}.
{host, {10,193,80,44}, ["9d5c276e40dbedcf6dda47203cd3584f"]}.
{lookup, [file, native]}.
From the node name, you can determine the IP address. In this example, the following IP addresses are determined:
Node name (nodes_running_at_shutdown)    Node ID (erl_inetrc)              Node IP
rabbit@2d6a7c96149d5cb12be2c06bf9b19042  2d6a7c96149d5cb12be2c06bf9b19042  10.193.80.42
rabbit@9d5c276e40dbedcf6dda47203cd3584f  9d5c276e40dbedcf6dda47203cd3584f  10.193.80.44
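The lookup from node name to IP address can be sketched as follows. This is an illustrative helper only: it assumes every host entry in erl_inetrc has exactly the one-alias form shown above, and the names parse_erl_inetrc and ip_for_node are hypothetical.

```python
import re
from typing import Dict

# Contents of erl_inetrc, copied from the example above.
SAMPLE_INETRC = """\
{host, {10,193,80,42}, ["2d6a7c96149d5cb12be2c06bf9b19042"]}.
{host, {10,193,80,43}, ["d15a691d1410cc2fbc7689c16db380be"]}.
{host, {10,193,80,44}, ["9d5c276e40dbedcf6dda47203cd3584f"]}.
{lookup, [file, native]}.
"""

# Matches: {host, {A,B,C,D}, ["node-id"]}.
HOST_ENTRY = re.compile(
    r'\{host,\s*\{(\d+),(\d+),(\d+),(\d+)\},\s*\["([^"]+)"\]\}\.'
)


def parse_erl_inetrc(text: str) -> Dict[str, str]:
    """Map each node ID to its dotted IPv4 address (hypothetical helper)."""
    hosts = {}
    for match in HOST_ENTRY.finditer(text):
        *octets, node_id = match.groups()
        hosts[node_id] = ".".join(octets)
    return hosts


def ip_for_node(node_name: str, hosts: Dict[str, str]) -> str:
    """Resolve a node name such as 'rabbit@<id>' to its IP address."""
    node_id = node_name.split("@", 1)[1]
    return hosts[node_id]


hosts = parse_erl_inetrc(SAMPLE_INETRC)
print(ip_for_node("rabbit@2d6a7c96149d5cb12be2c06bf9b19042", hosts))  # 10.193.80.42
```

Lines that are not host entries, such as the lookup line, are ignored.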

Once you know the IP address, SSH into the first node listed. In this example, the first node listed is 2d6a7c96149d5cb12be2c06bf9b19042, with the IP address 10.193.80.42.

To determine which instance to SSH into, run bosh instances. For example:
Instance                                               Process State  AZ   IPs
on-demand-broker/aa341f67-f96e-4032-a001-f55e30392588  running        az1  10.193.80.45
rabbitmq-broker/58d32104-3990-466b-8875-a4fe8b5694ed   running        az1  10.193.80.40
rabbitmq-haproxy/5d655a72-d31a-41b9-a3af-fa81e56d6a68  running        az1  10.193.80.41
rabbitmq-server/4920fdd9-2640-4729-a695-252633911b0a   running        az1  10.193.80.42
rabbitmq-server/d683149f-8420-4033-9679-e392ac51d81a   stopped        az1  10.193.80.43
rabbitmq-server/e20b11b1-a1b6-405d-903d-42b8d6a73a38   stopped        az1  10.193.80.44
The IP address 10.193.80.42 corresponds to node rabbitmq-server/4920fdd9-2640-4729-a695-252633911b0a.
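Matching an IP address to its instance can be sketched as follows. This is a rough helper, assuming the whitespace-separated table layout shown above has been captured as text; the function name instance_for_ip is hypothetical.

```python
from typing import Optional

# Output of bosh instances, copied from the example above.
SAMPLE_INSTANCES = """\
Instance                                               Process State  AZ   IPs
on-demand-broker/aa341f67-f96e-4032-a001-f55e30392588  running        az1  10.193.80.45
rabbitmq-broker/58d32104-3990-466b-8875-a4fe8b5694ed   running        az1  10.193.80.40
rabbitmq-haproxy/5d655a72-d31a-41b9-a3af-fa81e56d6a68  running        az1  10.193.80.41
rabbitmq-server/4920fdd9-2640-4729-a695-252633911b0a   running        az1  10.193.80.42
rabbitmq-server/d683149f-8420-4033-9679-e392ac51d81a   stopped        az1  10.193.80.43
rabbitmq-server/e20b11b1-a1b6-405d-903d-42b8d6a73a38   stopped        az1  10.193.80.44
"""


def instance_for_ip(bosh_output: str, ip: str) -> Optional[str]:
    """Return the instance whose row contains the given IP, or None if no row matches."""
    for line in bosh_output.splitlines():
        fields = line.split()
        # The first field is the instance name; the IP must match a later field exactly.
        if len(fields) >= 2 and ip in fields[1:]:
            return fields[0]
    return None


print(instance_for_ip(SAMPLE_INSTANCES, "10.193.80.42"))
```

Comparing whole fields rather than substrings avoids matching 10.193.80.4 against 10.193.80.42.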

SSH into this node and check the file nodes_running_at_shutdown. It should list only one node, which should be the node you just SSH'ed into:
cat /var/vcap/store/rabbitmq/mnesia/db/nodes_running_at_shutdown
[rabbit@2d6a7c96149d5cb12be2c06bf9b19042].
If the file on this node lists more than one node, repeat the process until you find a RabbitMQ server whose file lists only one node.
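The selection rule can be sketched as follows, assuming you have already copied the contents of nodes_running_at_shutdown from each RabbitMQ server into a dictionary keyed by node name. The function names and the node names in the usage example are hypothetical placeholders, not values from a real deployment.

```python
import re
from typing import Dict, List


def nodes_listed(file_text: str) -> List[str]:
    """Extract name@host node names from the contents of nodes_running_at_shutdown."""
    return re.findall(r"[\w.-]+@[\w.-]+", file_text)


def single_entry_nodes(files_by_node: Dict[str, str]) -> List[str]:
    """Return the nodes whose file lists exactly one node, namely the node itself."""
    return [
        node
        for node, text in files_by_node.items()
        if nodes_listed(text) == [node]
    ]


# Hypothetical input: placeholder node names and file contents for illustration.
collected = {
    "rabbit@node-a": "[rabbit@node-a,rabbit@node-c].",
    "rabbit@node-c": "[rabbit@node-c].",
}
print(single_entry_nodes(collected))
```

An empty result means none of the collected files qualifies yet, so more servers need to be checked.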

Once you locate the node with only one entry in nodes_running_at_shutdown, run bosh stop on the node that was left running and bosh start on the node you just located.

IMPORTANT NOTE: Now it is safe to proceed with the steps to Upgrade the RabbitMQ for PCF Pre-Provisioned Service.