RabbitMQ: Recurring “shovel spec clean up failed with exit:noproc” Warnings in Logs

Products

VMware Tanzu Data Suite RabbitMQ VMware Tanzu RabbitMQ

Issue/Introduction

RabbitMQ logs show recurring warnings similar to the following when running 4.1.4 with Khepri:

[warning] <xxx> Recurring shovel spec clean up failed with exit:{noproc,
[warning] <xxx>                           {gen_server,call,
[warning] <xxx>                             [rabbit_shovel_dyn_worker_sup_sup,which_children,infinity]}}

These warnings can appear periodically (for example, once per minute) even when there are no actively running shovels in the cluster.
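
To gauge how often the warning recurs, the matching lines in a node's log file can be counted. The following is a minimal sketch: the function name is hypothetical, and the log file path is supplied by the caller because its location depends on the installation.

```shell
# Hypothetical helper: count occurrences of the cleanup warning in a log file.
# $1 is the path to a RabbitMQ log file (location varies by installation).
count_shovel_cleanup_warnings() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function usable under "set -e".
    grep -c 'Recurring shovel spec clean up failed' "$1" || true
}
```

Each warning spans several log lines, but only the first line contains the matched phrase, so the count corresponds to the number of failed cleanup runs.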

Environment

RabbitMQ 4.1.4 (Khepri, open‑source distribution)
Erlang/OTP 27 (or a similar supported version)
rabbitmq_shovel and rabbitmq_shovel_management plugins enabled

Cause

In RabbitMQ 4.1.4, the warning Recurring shovel spec clean up failed with exit:{noproc,{gen_server,call,[rabbit_shovel_dyn_worker_sup_sup,which_children,infinity]}} indicates that the dynamic shovel supervisor process (rabbit_shovel_dyn_worker_sup_sup) is not running when a periodic cleanup job executes.

This supervisor is part of the Shovel plugin and manages dynamic shovels defined via runtime parameters. The cleanup job calls gen_server:call/3 with which_children to list the supervisor’s children and clean up shovel specs. If the supervisor has crashed or never started correctly, no process is registered under that name and the call exits with noproc.
The failure is typically triggered by lifecycle or timing issues around dynamic shovels, such as node restarts, runtime parameter import/update, or supervisor crashes, and has been observed even when there are no active shovel definitions but the Shovel plugin is enabled.
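
One way to confirm this condition on a node is to ask the Erlang runtime whether a process is registered under the supervisor's name. The sketch below wraps the standard erlang:whereis/1 call in rabbitmqctl eval. The function name is hypothetical, and the sketch assumes that the command prints undefined when no such process exists and a process identifier otherwise.

```shell
# Hypothetical helper: report whether the dynamic shovel supervisor is
# registered on the local node. Assumes "rabbitmqctl eval" prints the
# evaluated term ("undefined" or a pid) on standard output.
shovel_sup_status() {
    out=$(rabbitmqctl eval 'erlang:whereis(rabbit_shovel_dyn_worker_sup_sup).') || {
        # rabbitmqctl itself failed (for example, the node is down).
        echo "unknown"
        return 1
    }
    case "$out" in
        *undefined*) echo "not running" ;;
        *)           echo "running: $out" ;;
    esac
}
```

A result of "not running" on a node that logs the warning is consistent with the cause described above.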

The implementation relies on a mirrored‑supervisor‑style design for shovel management, and there are known edge cases where the dynamic shovel supervisor terminates and is not restarted automatically. In that case the recurring cleanup job keeps attempting gen_server:call and produces repeated noproc warnings.
When rabbitmq_shovel and rabbitmq_shovel_management are enabled, the dynamic shovel supervisor and its periodic cleanup job start even if no shovel definitions are configured.

On affected versions, this background cleanup can intermittently fail with noproc if the supervisor process has exited, leading to recurring warnings in otherwise idle clusters.

Resolution

On open‑source RabbitMQ 4.1.4, no configuration change fully eliminates these warnings while keeping the legacy Shovel implementation enabled.
Tanzu RabbitMQ 4.2 introduces a redesigned, distributed Shovel implementation that spreads shovel workers across the cluster and removes several limitations of the legacy mirrored‑supervisor‑based design. This reduces the likelihood of single‑node supervisor failures causing recurring cleanup warnings.
The distributed Shovel feature is available only in Tanzu RabbitMQ and is not part of the upstream open‑source RabbitMQ 4.1.x line.
For high‑availability environments that rely heavily on shovels and are sensitive to these warnings, upgrading to Tanzu RabbitMQ 4.2.0 or later can provide a more robust Shovel implementation.

Workaround

If Shovel is not required on the cluster, the simplest and recommended approach is to disable the Shovel plugins, which prevents the dynamic shovel supervisor from starting and stops the recurring cleanup job, eliminating the warnings (at the cost of losing Shovel functionality).

1. Disable Shovel plugins:

rabbitmq-plugins disable rabbitmq_shovel rabbitmq_shovel_management

2. Restart the RabbitMQ service so the plugins are fully unloaded:

systemctl restart rabbitmq-server   # Or equivalent for your OS/distribution
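
After the restart, it can be useful to confirm that neither plugin is still enabled. The sketch below filters the output of rabbitmq-plugins list using the -e (enabled, explicitly or implicitly) and -m (minimal, names only) options. The function name is hypothetical, and the exact output layout of the list command is an assumption, so the filter matches whole plugin names wherever they appear on a line.

```shell
# Hypothetical helper: print any Shovel plugins that are still enabled.
# Prints nothing when both plugins are disabled.
shovel_plugins_still_enabled() {
    # -E: extended regex, -o: print only the match, -w: match whole words
    # (underscores count as word characters, so other plugins whose names
    # merely start with "rabbitmq_shovel" are not reported).
    rabbitmq-plugins list -e -m \
        | grep -E -o -w 'rabbitmq_shovel(_management)?' || true
}
```

Empty output indicates that the dynamic shovel supervisor and its cleanup job should no longer be started on that node.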

If Shovel might be needed in the future, but you want to mitigate the warnings without fully disabling it, you can manually restart the rabbitmq_shovel application on the affected node(s) to recreate the dynamic shovel supervisor.
This typically clears the warnings until the next supervisor crash or lifecycle event:

rabbitmqctl eval 'application:stop(rabbitmq_shovel), application:start(rabbitmq_shovel).'

Note that this workaround is node‑local and may need to be repeated if the dynamic shovel supervisor exits again due to the underlying lifecycle issue.
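
Because the restart is node‑local, it may be convenient to apply it to several nodes in turn. The following sketch repeats the command above for each node name passed as an argument, using the -n option of rabbitmqctl to select the target node. The function name and the example node names are hypothetical.

```shell
# Hypothetical helper: restart the rabbitmq_shovel application on each
# node given as an argument (for example: rabbit@node1 rabbit@node2).
restart_shovel_on_nodes() {
    for node in "$@"; do
        echo "restarting rabbitmq_shovel on $node"
        rabbitmqctl -n "$node" eval \
            'application:stop(rabbitmq_shovel), application:start(rabbitmq_shovel).' \
            || echo "restart failed on $node" >&2
    done
}
```

A call such as restart_shovel_on_nodes rabbit@node1 rabbit@node2 processes the nodes one at a time and continues with the next node if one restart fails.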
