Push Notification Service push Errand failing with error:
com.rabbitmq.client.ShutdownSignalException: failed to perform operation on queue
As an example, consider a deployment of RabbitMQ tile version 1.8.6 for Pivotal Cloud Foundry, which installs OSS RabbitMQ version 3.6.9 by default.
After the PCF Push Notification Service tile version 1.7.x is deployed, the smoke-tests errand for Push Notifications fails while trying to declare the analytics-log-queue queue.
Errors from the install logs:
2017-05-31T15:49:25.44+0000 [APP/0] OUT Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - failed to perform operation on queue ‘analytics-log-queue’ in vhost ‘9e02853a-xxxx-xxxx-xxxx-7c14a8e5c261’ due to timeout, class-id=50, method-id=10)
Errors in rabbitmq-server logs:
=ERROR REPORT==== 31-May-2017::15:49:25 === Channel error on connection <0.20489.2> (1.2.3.4:48116 -> 1.2.3.5:5672, vhost: ‘9e02853a-xxxx-xxxx-xxx-7c14a8e5c261’, user: ‘7ab47a83-xxxx-xxxx-xxxx-f7588d7810d3’), channel 1: operation queue.declare caused a channel exception not_found: “failed to perform operation on queue ‘analytics-log-queue’ in vhost ‘9e02853a-xxxx-xxxx-xxxx-7c14a8e5c261’ due to timeout”
This is a known issue caused by a deadlock in OSS RabbitMQ versions 3.6.6 through 3.6.9. It is permanently fixed in OSS RabbitMQ version 3.6.10 and above.
The following versions of RabbitMQ tile for PCF have the permanent fix installed (OSS RabbitMQ 3.6.10):
For RabbitMQ 1.7.x for PCF - 1.7.22 and above
For RabbitMQ 1.8.x for PCF - 1.8.12 and above
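The affected OSS RabbitMQ range stated above (3.6.6 through 3.6.9) can be checked with a small shell helper. This is a minimal sketch: the function name is_affected_version is hypothetical, and it assumes you already know the OSS RabbitMQ version string (for example from the tile release notes) and pass it in as major.minor.patch.

```shell
#!/bin/sh
# is_affected_version is a hypothetical helper, not part of any RabbitMQ tooling.
# Returns 0 if the version is in the affected range 3.6.6 - 3.6.9,
# 1 if it is outside that range, and 2 if the string cannot be parsed.
is_affected_version() {
    version="$1"
    # Require at least two dots (major.minor.patch).
    case "$version" in
        *.*.*) ;;
        *) return 2 ;;
    esac
    # Split the string into its three fields.
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    # Drop anything after the patch number (for example "-rc1").
    patch=${patch%%[!0-9]*}
    # Every field must be a non-empty run of digits.
    for field in "$major" "$minor" "$patch"; do
        case "$field" in
            ''|*[!0-9]*) return 2 ;;
        esac
    done
    [ "$major" -eq 3 ] && [ "$minor" -eq 6 ] && [ "$patch" -ge 6 ] && [ "$patch" -le 9 ]
}
```

For example, `is_affected_version 3.6.9 && echo affected` prints `affected`, while the same call with 3.6.10 prints nothing.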
As a workaround, the affected queue can be manually deleted from the RabbitMQ server (substitute the vhost shown in your own logs):
rabbitmqctl eval 'Q = {resource, <<"9e02853a-xxxx-xxxx-xxxx-7c14a8e5c261">>, queue, <<"analytics-log-queue">>}, rabbit_amqqueue:internal_delete(Q).'
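Because the vhost ID differs per deployment, it may help to generate the workaround command above from the values found in your logs rather than editing it by hand. The following is a minimal sketch: build_delete_command is a hypothetical helper name, and it only prints the command so that it can be reviewed before it is run on the RabbitMQ server.

```shell
#!/bin/sh
# build_delete_command is a hypothetical helper used only for this sketch.
# It prints the rabbitmqctl command from this article for a given vhost and queue.
build_delete_command() {
    vhost="$1"
    queue="$2"
    # Reject empty values and characters that would break the quoted Erlang expression.
    for value in "$vhost" "$queue"; do
        case "$value" in
            ''|*\'*|*\"*|*\\*)
                echo "invalid vhost or queue name" >&2
                return 1
                ;;
        esac
    done
    # Print the command only; nothing is executed against RabbitMQ here.
    printf "rabbitmqctl eval 'Q = {resource, <<\"%s\">>, queue, <<\"%s\">>}, rabbit_amqqueue:internal_delete(Q).'\n" "$vhost" "$queue"
}
```

Calling `build_delete_command "<vhost-id>" "analytics-log-queue"` with the vhost ID from the error message prints the full command, which can then be copied to the RabbitMQ server and run there.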
For a permanent fix, it is recommended to upgrade the RabbitMQ tile for PCF to the latest version.