RabbitMQ service not starting and showing red in vIDM dashboard

Article ID: 367757

Updated On:

Products

VMware Aria Suite

Issue/Introduction

Symptoms:

  • RabbitMQ does not start during vIDM boot.
  • The vIDM dashboard shows the error "There was a problem Messaging service Error retrieving RabbitMQ status".
  • Any interaction with RabbitMQ fails:
    root@idm [ ~ ]# rabbitmqctl stop_app
    Stopping rabbit application on node rabbitmq@vm-idm ...
    Error: unable to perform an operation on node 'rabbitmq@idm'. Please see diagnostics information and suggestions below.
          
    root@idm [ ~ ]# rabbitmqctl force_reset
    Error: unable to perform an operation on node 'rabbitmq@idm'. Please see diagnostics information and suggestions below.
         
    root@idm [ ~ ]# rabbitmqctl start_app
    Starting node rabbitmq@vm-idm ...
    Error: unable to perform an operation on node 'rabbitmq@idm'. Please see diagnostics information and suggestions below.
  • Checking the status of the RabbitMQ service shows that a crash dump was written.
  • The horizon log (/opt/vmware/horizon/workspace/logs/horizon.log) contains the message "Messaging Connection: Messaging connection test failed", along with entries such as:
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] com.vmware.horizon.messaging.channel.http.HttpChannel - Stop resending message to: http://127.0.0.1/AUDIT/API/1.0/REST/audit/consume. Status code: 500
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.06659a45-87c4-4009-bebc-26c0c59284d7] message added back to queue because: Cannot send message to: AnalyticsHttpChannel[callbackUri=http://127.0.0.1/AUDIT/API/1.0/REST/audit/consume,serviceAuthTokenProvider=com.vmware.horizon.components.identity.accesscontrol.ServiceAuthTokenProvider@55a7c5ef,sslUtils=com.vmware.horizon.security.utils.SSLUtils@597aa1d,defaultHttpClient=org.apache.http.impl.client.InternalHttpClient@310fbbe5,authMetadata=,httpPost=] (fail.send.callback.uri). [DeliveryTag:993]
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.06659a45-87c4-4009-bebc-26c0c59284d7] is retrying current message for 3th time
    <TIMESTAMP> INFO  (subscriber-thread-285) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.06659a45-87c4-4009-bebc-26c0c59284d7] has one message requeued.
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.06659a45-87c4-4009-bebc-26c0c59284d7] reached more than 10 errors in a row,

    As undelivered messages pile up in RabbitMQ, they can consume all the disk space available to RabbitMQ. RabbitMQ then blocks the connection, and the messaging service is reported as unhealthy. Check the log for the following symptoms:

    <TIMESTAMP> INFO  (AMQP Connection 127.0.0.1:5672) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessagingProvider - Connection to localhost unblocked by RabbitMQ
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.06659a45-87c4-4009-bebc-26c0c59284d7] reached more than 1000 errors in a row, disabling.
    <TIMESTAMP> WARN  (AMQP Connection 127.0.0.1:5672) [;;;] com.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessagingProvider - Connection to localhost blocked by RabbitMQ: low on disk
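
The low-disk symptom above can be checked for with a quick search of the horizon log. The following is a minimal sketch, not a VMware-provided tool: the function name count_blocked_events is hypothetical, and the search string is taken from the sample log line above, so it may need adjusting if the message wording differs in your version.

```shell
#!/bin/sh
# Count "blocked by RabbitMQ: low on disk" events in a horizon log.
# count_blocked_events is a hypothetical helper name used for illustration.
# Usage: count_blocked_events [logfile]
count_blocked_events() {
    log="${1:-/opt/vmware/horizon/workspace/logs/horizon.log}"
    # -F treats the pattern as a fixed string; the leading space keeps
    # "unblocked by RabbitMQ" lines from matching.
    # grep -c exits with status 1 when the count is 0, so ignore the status.
    grep -cF ' blocked by RabbitMQ: low on disk' "$log" || true
}
```

A result greater than 0 suggests that RabbitMQ has blocked the connection at least once because of low disk space.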

Environment

VMware Identity Manager 3.3.x

Resolution

  1. Take a snapshot of the vIDM cluster.
  2. Stop the horizon-workspace service on each node: service horizon-workspace stop
  3. Reset RabbitMQ on each node: rabbitmqctl reset
  4. Restart RabbitMQ on each node: systemctl restart rabbitmq-server.service
  5. Start the horizon-workspace service on each node: service horizon-workspace start
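
Steps 2 to 5 above can be wrapped in a small helper so that the same command is issued on every node. This is a sketch under assumptions rather than an official script: run_step and the DRY_RUN switch are hypothetical names introduced here, and only the four commands themselves come from the steps above. By default it prints the command instead of executing it.

```shell
#!/bin/sh
# Sketch of resolution steps 2-5. Take the snapshot (step 1) first.
# DRY_RUN is a hypothetical switch: when it is 1 (the default),
# commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# run_step N: issue the command for step N of the resolution on this node.
run_step() {
    case "$1" in
        2) run service horizon-workspace stop ;;
        3) run rabbitmqctl reset ;;
        4) run systemctl restart rabbitmq-server.service ;;
        5) run service horizon-workspace start ;;
        *) echo "unknown step: $1" >&2; return 1 ;;
    esac
}

# Preview a step:  run_step 3
# Execute a step:  DRY_RUN=0 run_step 3
```

Running each step by number keeps the order of the procedure explicit, so a step can be completed on every node before moving on to the next one.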