VMware Tanzu RabbitMQ [VMs] tile Pre-Stop script failing

Article ID: 293246


Products

VMware RabbitMQ

Issue/Introduction

Affected Versions: VMware Tanzu RabbitMQ [VMs] 1.18.5, 1.18.6, 1.19.0 and 1.19.1

When attempting to perform a BOSH operation on a RabbitMQ node, the operation fails while running the pre-stop script:
 
Task 231 | 14:14:29 | Updating instance rabbitmq-server: rabbitmq-server/eba54bce-76e5-42f3-b73f-88b4019e99f6 (0) (canary) (00:00:37)
                    L Error: Action Failed get_task: Task 84a14abf-6177-4bdf-687d-7958c56bceba result: 1 of 1 pre-stop scripts failed. Failed Jobs: rabbitmq-server.

The rabbitmq-server log at /var/vcap/sys/log/rabbitmq-server/pre-stop.stderr.log shows the following error:
 
queue 'queue' in vhost 'ba99f3ac-8a19-414c-98d5-b3e58d8dfe4a' would lose its only synchronised replica (master) if node rabbit@83790a04cb52316520e78524c962e926 is stopp
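When several queues are affected, it can help to pull the queue and vhost names out of the log rather than reading it by eye. The following is a minimal sketch, assuming the message keeps the wording shown above; at_risk_queues is a hypothetical helper name, not part of the tile.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with the tile): reads pre-stop log text
# on stdin and prints one "vhost=<vhost> queue=<queue>" line for every
# "would lose its only synchronised replica" message.
# Assumes the message wording shown in this article and that neither the
# queue name nor the vhost name contains a single quote.
at_risk_queues() {
  sed -n "s/.*queue '\([^']*\)' in vhost '\([^']*\)' would lose its only synchronised replica.*/vhost=\2 queue=\1/p"
}

# Assumed usage on the affected RabbitMQ VM:
#   at_risk_queues < /var/vcap/sys/log/rabbitmq-server/pre-stop.stderr.log
```

Lines that do not match the message are ignored, so the whole log file can be passed in unchanged.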


Environment

Product Version: 1.18

Resolution

This error occurs when classic mirrored queues are present without an online synchronised mirror, that is, queues that could lose data if the target node is shut down.

There are two scenarios where this may occur:

Scenario 1
First, verify that all of your RabbitMQ nodes are up and running. You can do this on the Overview page of the RabbitMQ Management UI, or by running rabbitmqctl cluster_status. If a node is down, it may contain the only mirror for a queue.

Bring the node back online with rabbitmqctl start_app to resolve this.

Scenario 2
If all nodes are up and running, use the Management UI to check whether the queue named in the pre-stop logs is an exclusive queue.



If it is an exclusive queue, next check if a mirroring policy has been applied to it. You can do this by clicking on the queue name, and checking the Effective policy definition section.


In the above example, the queue is expected to be mirrored to other nodes. However, exclusive queues are not mirrored by design [1]. This causes the pre-stop script to fail. This behaviour will be changed in a future release of RabbitMQ [2].
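The same check can be approximated from the command line instead of the Management UI. The sketch below is a filter for the output of rabbitmqctl list_queues with the name, exclusive and policy columns. It assumes the output is tab-separated, that exclusive is printed as true or false, and that policy is empty when no policy applies. exclusive_with_policy is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads tab-separated "name exclusive policy" rows on
# stdin and prints the exclusive queues that have a policy applied.
# Assumes exclusive is "true"/"false" and policy is empty when unset.
# Header and status lines do not match the filter and are skipped.
exclusive_with_policy() {
  awk -F '\t' '$2 == "true" && $3 != "" { print $1 " (policy: " $3 ")" }'
}

# Assumed usage on a RabbitMQ node, with VHOST set to the vhost named in
# the pre-stop log:
#   rabbitmqctl list_queues -q -p "$VHOST" name exclusive policy | exclusive_with_policy
```

Any queue printed by this filter is exclusive and has a policy applied. The filter does not inspect the policy definition, so confirm that it is a mirroring policy before treating the queue as the cause.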

To resolve this, you have the following options:
  • Once released, upgrade to version 1.19.2 or 1.18.7 of the tile. These versions have an option that allows you to opt out of the pre-stop script.
  • Remove the exclusive queue by stopping the application creating it.
  • Update the first two lines of the script as shown below. The script is located at /var/vcap/jobs/rabbitmq-server/bin/pre-stop, and you will need to do this on each RabbitMQ server VM. Please note that this change will be reverted after a deploy.
    • #!/bin/bash
      exit 0
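For the last option, the manual edit can be scripted so that it is applied the same way on every VM. This is a minimal sketch rather than a supported procedure: disable_pre_stop and the .bak backup name are assumptions. It prepends the two lines instead of overwriting the existing first two lines, which has the same effect because the script exits before reaching the original content.

```shell
#!/bin/bash
# Hypothetical helper: makes the given pre-stop script exit 0 immediately.
# The original is kept as <script>.bak (backup name is an assumption) and
# is never overwritten, so running the helper twice gives the same result.
disable_pre_stop() {
  script="$1"
  [ -f "$script" ] || { echo "not found: $script" >&2; return 1; }
  # Take the backup only once, preserving mode and timestamps.
  [ -e "$script.bak" ] || cp -p "$script" "$script.bak" || return 1
  # Rewrite in place so the file keeps its permissions.
  { printf '#!/bin/bash\nexit 0\n'; cat "$script.bak"; } > "$script"
}

# Assumed usage, as root on each RabbitMQ server VM:
#   disable_pre_stop /var/vcap/jobs/rabbitmq-server/bin/pre-stop
```

With the script disabled, the check for queues without a synchronised mirror no longer runs before the node stops. As noted above, a deploy reverts the change.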


[1] https://www.rabbitmq.com/ha.html#exclusive-queues-are-not-mirrored
[2] https://github.com/rabbitmq/rabbitmq-server/pull/2399