Several customers have reported delayed analysis and high message volumes in RabbitMQ retry queues on their on-premises NSX Lastline installations. Affected environments exhibited symptoms such as:
Sluggish analysis pipelines
Eventual SIEM alert delays
Accumulating queue sizes, particularly within retry.N queues
Example queue status from rabbitmqctl: to check which queues are filling up, run rabbitmqctl -p llq_v1 list_queues | grep -v -w 0$; the output will include retry queues similar to those shown below.
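The original customer output is not reproduced here; purely as a hypothetical illustration of the format (queue names beyond retry.N and all message counts are examples, not actual data), the relevant portion of the output looks similar to:

    retry.1    312
    retry.2    547
    retry.5    10482

Non-empty retry.N queues whose counts keep growing over time are the symptom to look for.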
Upon further investigation, it was discovered that the messages piling up in the retry queues, especially retry.5, carried the routing key inserted_pcaps.c2736.
These messages are handled by the queue_worker_inserted-pcaps, which relies on the pcapapi component to interface with Suricata for analysis. The processing failures were traced back to the following:
The worker was receiving 500 INTERNAL SERVER ERROR responses from the pcapapi2 service.
These errors originated from a Python exception in suricata.py: AttributeError: 'AlprotoDns' object has no attribute 'rdata'
The error stems from a mismatch between the pcapapi plugin logic and newer versions of the Suricata EVE JSON structures; a simplified illustration of this failure pattern appears below.
As a result, message processing fails, the messages enter an exponential backoff retry loop, and the retry queues eventually fill up.
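To make the nature of the failure more concrete, the following is a simplified, hypothetical Python sketch. The class names AlprotoDnsV1/AlprotoDnsV2 and the answers structure are illustrative assumptions, not the actual pcapapi or Suricata code; the point is that code which assumes an rdata attribute raises exactly this kind of AttributeError once a newer schema moves the answer data elsewhere, while a defensive lookup degrades gracefully.

    # Hypothetical, simplified illustration of the failure mode; not the
    # actual pcapapi/suricata.py implementation.

    class AlprotoDnsV1:
        # Older-style DNS record object: answer data exposed as .rdata.
        def __init__(self, rdata):
            self.rdata = rdata

    class AlprotoDnsV2:
        # Newer-style DNS record object: answers grouped into a list,
        # so there is no top-level .rdata attribute any more.
        def __init__(self, answers):
            self.answers = answers

    def extract_rdata_naive(record):
        # Works for the old shape; raises
        #   AttributeError: ... object has no attribute 'rdata'
        # for the new one, i.e. the exception seen behind the pcapapi2 500s.
        return [record.rdata]

    def extract_rdata_defensive(record):
        # Tolerates both shapes by probing for the attribute first.
        if hasattr(record, "rdata"):
            return [record.rdata]
        return [a["rdata"] for a in getattr(record, "answers", []) if "rdata" in a]

Calling extract_rdata_naive on a record of the newer shape reproduces the AttributeError, whereas the defensive variant returns whatever answer data it can find.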
Additional observations:
No significant logs were found in /var/log/suricata or through journalctl.
Suricata in this case was running directly on the manager node (not in a container).
If you encounter this issue, please contact Broadcom support.
Additional Context: Retry Queues in RabbitMQ
The retry.N queues are part of an exponential backoff system for transient errors.
They store failed messages temporarily before retrying with increasing delay.
Messages are routed back to their original queue after TTL expiry for reprocessing.
Once the maximum number of retries is exceeded, messages may be silently dropped if no binding exists for the current retry_count (see the illustrative sketch below).
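As background only, the following is a minimal, hypothetical pika sketch of how such an exponential backoff topology is typically wired up in RabbitMQ, using a per-level message TTL and dead-lettering back to the work queue. The queue names, TTL values, and connection details are illustrative assumptions, not the actual Lastline configuration.

    import pika

    # Minimal sketch: declare a work queue plus retry.N queues whose TTL grows
    # with N. Expired messages are dead-lettered back to the work queue for
    # reprocessing. All names and TTLs here are illustrative.
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()

    channel.queue_declare(queue="work", durable=True)

    for n, ttl_ms in enumerate([1000, 5000, 30000, 120000, 600000], start=1):
        channel.queue_declare(
            queue="retry.%d" % n,
            durable=True,
            arguments={
                "x-message-ttl": ttl_ms,              # delay before the next attempt
                "x-dead-letter-exchange": "",         # default exchange
                "x-dead-letter-routing-key": "work",  # route back to the original queue
            },
        )

    connection.close()

In a setup like this, a consumer that fails to process a message republishes it to retry.(retry_count + 1); once retry_count exceeds the highest declared level, there is no queue left to receive it, which is why messages can be silently dropped as described above.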