When performing a rolling restart of a RabbitMQ cluster running on Kubernetes, one or more pods may fail to start and enter a crash loop with the error:
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0>
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> BOOT FAILED
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> ===========
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> Exception during startup:
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0>
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> exit:timeout_waiting_for_leader
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0>
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> rabbit_khepri:setup/1, line 278
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> rabbit:run_prelaunch_second_phase/0, line 396
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> rabbit:start/2, line 922
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0> application_master:start_it_old/4, line 293
2025-07-08 09:54:46.580662+00:00 [error] <0.2762.0>
Later, the remaining pods can run into the same boot failure, which typically turns into an indefinite crash loop across the whole cluster.
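To confirm which pods are affected, it can help to check pod status and inspect the logs of a crashed pod (the namespace and pod names below are placeholders), for example:
kubectl -n <namespace> get pods
kubectl -n <namespace> logs <pod-name> --previous | grep -B 2 -A 10 "BOOT FAILED"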
This behavior is generally observed when Khepri is enabled as the metadata store. In that case, a majority of the cluster nodes must be online for a leader to be elected. As described in the RabbitMQ documentation (see <Restarting a Cluster Member>):
"
When a cluster member is restarted or stopped, the remaining nodes may lose their quorum. This may affect the ability to start a node.
For example, in a cluster of 5 nodes where all nodes are stopped, the first two starting nodes will wait for the third node to start before completing their boot and start to serve messages. That’s because the metadata store needs at least 3 nodes in this example to elect a leader and complete the initialization process. In the meantime the first two nodes wait and may time out if the third one does not appear.
"
However, in RabbitMQ versions below 4.0.6, the default timeout for Khepri leader election was only 30 seconds. Cluster members sometimes need more time than that to start, so the election fails and the node crashes with the timeout_waiting_for_leader error shown above.
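One way to confirm that Khepri is in use on an already-running node is to check the state of the khepri_db feature flag, for example:
kubectl -n <namespace> exec <pod-name> -- rabbitmqctl list_feature_flags name state | grep khepri_db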
Temporary Workaround
If the pods simply need more time to start, the khepri_leader_wait_retry_timeout setting can be increased. This is an advanced RabbitMQ setting; in versions below 4.0.6 it defaults to 30,000 milliseconds (30 seconds).
You can increase this value, for example to 300,000 milliseconds (5 minutes), to allow more time for leader election. For RabbitMQ clusters deployed via the RabbitMQ Kubernetes Operator, it can be configured as follows:
---
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
...
spec:
  rabbitmq:
    advancedConfig: |
      [
        {rabbit, [
          ...
          {khepri_leader_wait_retry_timeout, 300000}
        ]},
        ...
Once updated, apply the change with kubectl apply (or perform a rolling restart of the cluster), then confirm that it has taken effect:
kubectl -n <namespace> get rabbitmqcluster <cluster-name> -o yaml | grep timeout
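As a concrete sketch, assuming the manifest is saved locally as rabbitmq-cluster.yaml and the Operator's default naming (a StatefulSet called <cluster-name>-server), the steps might look like the following; the rendered config path can vary by Operator version:
kubectl -n <namespace> apply -f rabbitmq-cluster.yaml
kubectl -n <namespace> rollout restart statefulset <cluster-name>-server
kubectl -n <namespace> exec <cluster-name>-server-0 -- cat /etc/rabbitmq/advanced.config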
Resolution
To permanently resolve the issue, upgrade to RabbitMQ version 4.0.6 or higher, where the default timeout for Khepri leader election has been increased to 5 minutes. Alternatively, upgrade to RabbitMQ 4.1.0 or later, where the flawed retry mechanism has been improved, eliminating this issue altogether.
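For Operator-managed clusters, the upgrade typically amounts to updating spec.image in the RabbitmqCluster resource; for example (the image tag below is illustrative):
kubectl -n <namespace> patch rabbitmqcluster <cluster-name> --type merge -p '{"spec":{"image":"rabbitmq:4.0.6-management"}}'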