Rabbitmq-server job on BOSH-deployed RabbitMQ service instance VM is reported being restarted repeatedly.
In rabbit@<NODE>.rabbitmq-server.services.service-instance-<GUID>.bosh.log, there are no any errors before the restart. According to monit log, monit daemon detected rabbitmq-server process is not running frequently.
$ grep "process is not running" monit.log ... [UTC Nov 19 11:47:06] error : 'rabbitmq-server' process is not running [UTC Nov 19 11:52:39] error : 'rabbitmq-server' process is not running [UTC Nov 19 12:03:03] error : 'rabbitmq-server' process is not running [UTC Nov 20 04:11:24] error : 'rabbitmq-server' process is not running [UTC Nov 23 11:22:57] error : 'rabbitmq-server' process is not running [UTC Nov 23 21:11:17] error : 'rabbitmq-server' process is not running [UTC NOv 23 27:28:40] error : 'rabbitmq-server' process is not running
In the meanwhile of each restart, startup_stderr.log records a CPU segmentation fault.
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
[os_mon] memory supervisor port (memsup): Erlang has closed
Segmentation fault
There is an issue with ESXi v8.0.2 and it is fixed in v8.0.3, please refer the github issue. It can be triggered when Erlang process invokes AVX-512 CPU instructions. To resolve the issue, please upgrade ESXi from v8.0.2 to higher versions.