This issue is described in the following GitHub issue: https://github.com/rabbitmq/rabbitmq-federation/issues/111
When the federation needs to move to another node, it uses a different key. This causes the crash. To recover, either recreate the federation or upgrade to RMQ 3.8.6.
This issue persists in 3.8.3.
The related log messages are below:
2020-08-24 05:36:00.255 [info] <0.713.0> Mirrored queue '3rd.requests.service.exchange-queue' in vhost 'PROD': Adding mirror on node [email protected]: <0.773.0>
** When handler state == []
253: ** Reason == {{badmatch,{error,{{{badmatch,{error,{{unable_to_parse_uri,no_scheme},[138,......
2020-08-24 05:42:56.017 [info] <0.773.0> Mirrored queue '3rd.requests.service.exchange-queue' in vhost 'PROD': Slave <[email protected]> saw deaths of mirrors <[email protected]>
2020-08-24 05:42:56.019 [info] <0.773.0> Mirrored queue '3rd.requests.service.exchange-queue' in vhost 'PROD': Promoting slave <[email protected]> to master
The badmatch and the promotion appear only for 3rd.requests.service.exchange-queue, not for the second federation exchange queue.
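To check which queues show the same pattern, the broker log can be scanned for the crash marker and the promotion messages. The following is only a minimal sketch: the helper name summarize_log and the regular expression are assumptions made for illustration, derived solely from the log excerpts quoted above, and may need adjusting for other log formats.

```python
import re

# Hypothetical helper: the pattern and marker below are taken from the log
# excerpts quoted in this note, not from any RabbitMQ API.
QUEUE_RE = re.compile(r"Mirrored queue '([^']+)' in vhost '([^']+)': (.*)")
CRASH_MARKER = "unable_to_parse_uri"


def summarize_log(lines):
    """Return (promoted, crashes) for an iterable of log lines.

    promoted is a set of (vhost, queue) pairs that logged a mirror promotion;
    crashes is the number of lines containing the crash marker.
    """
    promoted = set()
    crashes = 0
    for line in lines:
        if CRASH_MARKER in line:
            crashes += 1
        match = QUEUE_RE.search(line)
        if match and match.group(3).startswith("Promoting"):
            promoted.add((match.group(2), match.group(1)))
    return promoted, crashes
```

It could be called on an open log file, for example summarize_log(open(path)), where path is the location of the node's log file on your system.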
The workaround for the current version is to recreate the affected queues. For a permanent fix, upgrade to the next version that contains RabbitMQ 3.8.6. This version of RMQ is not yet bundled into the 1.20 RMQ tile; the bundling is expected to be completed in the next few weeks.
The badmatch first occurred on [email protected], which later promoted itself to master. When the federation needed to move to another node, it used a different key, and this is what causes the crash.