Queue Depth Stuck and Crashes in RabbitMQ Classic Queues

Article ID: 412100

Updated On:

Products

VMware Tanzu Platform - Cloud Foundry

Issue/Introduction

The depth of one or more RabbitMQ classic queues increases over time and eventually becomes stuck. One or more queues may crash, leaving messages unprocessed. The RabbitMQ node log shows entries similar to the following:

 

errorContext: child_terminated
reason: {function_clause, [{rabbit_msg_store,reader_pread_parse,[[eof]],...}]}
Restarting crashed queue '<queue_name>' in vhost '<vhost_id>'
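
To confirm that a node is hitting this issue, its log can be searched for the signature above. The following is a minimal sketch, not an official tool: the function name find_crashed_queues and both regular expressions are illustrative assumptions derived only from the three log lines shown in this article, and real log lines may carry additional prefixes such as timestamps.

```python
import re

# Patterns derived from the log excerpt in this article (assumed to be
# representative; adjust them if your log lines are formatted differently).
CRASH_SIGNATURE = re.compile(r"rabbit_msg_store,\s*reader_pread_parse")
RESTART_LINE = re.compile(
    r"Restarting crashed queue '(?P<queue>[^']*)' in vhost '(?P<vhost>[^']*)'"
)


def find_crashed_queues(log_lines):
    """Scan an iterable of log lines for the crash signature.

    Returns a tuple (signature_hits, queues), where signature_hits counts
    lines mentioning rabbit_msg_store reader_pread_parse and queues is a
    sorted list of (vhost, queue) pairs that the node reported restarting.
    """
    signature_hits = 0
    restarted = set()
    for line in log_lines:
        if CRASH_SIGNATURE.search(line):
            signature_hits += 1
        match = RESTART_LINE.search(line)
        if match:
            restarted.add((match.group("vhost"), match.group("queue")))
    return signature_hits, sorted(restarted)
```

The function accepts any iterable of strings, so an open log file object can be passed to it directly. A non-zero hit count together with restarted queues suggests, but does not prove, that the node is affected.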

Environment

RabbitMQ 4.X

Classic queues in use

 

Cause

Under certain conditions, RabbitMQ may try to read beyond the end of a classic queue file due to an off-by-9 error.

  • This causes RabbitMQ to return the current accumulator rather than parsing remaining data.

  • Some messages at the end of the file may be truncated, even though references to them remain in memory and still point to positions in the original file past the truncation point.

  • This results in queue depth inconsistencies, stuck queues, or message loss until the node restarts.

This is a known bug in RabbitMQ classic queues. A fix is scheduled for the next Tanzu RabbitMQ release.

Resolution

Immediate Workarounds

  • Restart the affected RabbitMQ node: This restores normal operation but may still lead to limited message loss.

  • Use Quorum Queues for critical workloads: Quorum queues are the recommended RabbitMQ queue type for data safety and resilience.

    • If message loss is unacceptable, migrate to quorum queues instead of classic queues.
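
A queue's type is chosen when the queue is declared, through the x-queue-type argument. The sketch below shows one way an application might declare a quorum queue. It assumes a client whose channel object exposes queue_declare(queue=..., durable=..., arguments=...), as the pika library's channel does; the helper name declare_quorum_queue and the queue name in the usage comment are hypothetical.

```python
# Optional queue argument that selects the quorum queue type.
QUORUM_ARGUMENTS = {"x-queue-type": "quorum"}


def declare_quorum_queue(channel, queue_name):
    """Declare a durable quorum queue on an already open channel.

    `channel` is assumed to behave like a pika channel. Quorum queues
    must be durable, so durable=True is always passed.
    """
    return channel.queue_declare(
        queue=queue_name,
        durable=True,
        arguments=dict(QUORUM_ARGUMENTS),
    )


# Usage with pika (assumed to be installed; connection details omitted):
#   channel = connection.channel()
#   declare_quorum_queue(channel, "orders.quorum")
```

The type of an existing queue cannot be changed in place. Declaring a quorum queue under the name of an existing classic queue typically fails with a precondition error, so a migration usually means declaring a new queue (or deleting and re-declaring the old one) and moving publishers and consumers over to it.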

Permanent Fix

  • A fix for this bug is scheduled for inclusion in the next Tanzu RabbitMQ release.

  • Upgrade to that release once available.