Grafana RabbitMQ dashboard not showing details when selecting a new RMQ cluster.
Dashboards use a query like the following:
sum(rabbitmq_connections * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster="$rabbitmq_cluster", namespace="$namespace"})
When accessing the RabbitMQ dashboard in Healthwatch 2.X, the charts show N/A and an exclamation mark is present when you hover over a chart.
The dashboard also shows a red popup window with a message that starts with "execution: found duplicate series for the match group {instance=...".
Detailed error message from dashboard:
Status: 500. Message: execution: found duplicate series for the match group {instance="10.###.###.###:9090", job="healthwatch-pas-exporter"} on the right hand-side of the operation: [{__name__="rabbitmq_build_info", deployment="p-rabbitmq-####################", erlang_version="26.2.5.7", exported_job="rabbitmq-server", index="########-####-####-####-############", instance="10.###.###.###:9090", ip="10.###.###.###", job="healthwatch-pas-exporter", origin="p-rabbitmq", prometheus_client_version="4.11.0", prometheus_plugin_version="3.13.8", rabbitmq_version="3.13.8", scrape_instance_group="pas-exporter-gauge", source_id="rabbit@localhost"}, {__name__="rabbitmq_build_info", deployment="p-rabbitmq-####################", erlang_version="26.2.5.7", exported_job="rabbitmq-server", index="########-####-####-####-############", instance="10.###.###.###:9090", ip="10.###.###.###", job="healthwatch-pas-exporter", origin="p-rabbitmq", prometheus_client_version="4.11.0", prometheus_plugin_version="3.13.8", rabbitmq_version="3.13.8", scrape_instance_group="pas-exporter-gauge", source_id="rabbit@localhost"}];many-to-many matching not allowed: matching labels must be unique on one side
Installing the Healthwatch Exporter for the Tanzu Application Service (TAS) tile along with the Healthwatch v2.X tile results in duplicated metrics. This is because Healthwatch scrapes the metrics from both the Prometheus port of the RabbitMQ service instance and from the Loggregator system of TAS.
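One way to confirm the duplication is to count the series per instance for the metric on the right-hand side of the join. The queries below are a minimal sketch, assuming they are run one at a time in Grafana Explore or the Prometheus expression browser against the Healthwatch Prometheus data source. The dashboard variables are left out on purpose so the result is not restricted to one cluster. If the error message names a different metric (for example rabbitmq_build_info), substitute that metric name.

```promql
# Instances that expose more than one rabbitmq_identity_info series.
# Any result here means on(instance) cannot match uniquely on the right-hand side.
count by (instance) (rabbitmq_identity_info) > 1

# The same count broken down by job, to see which scrape path contributes the extra series.
count by (instance, job) (rabbitmq_identity_info)
```

If the first query returns no rows, the right-hand side is unique per instance and the many-to-many error should not occur for that metric.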
The preferred method for eliminating the duplicate metrics collection (the root cause of the dashboard error) is to set the Metrics polling interval in the RabbitMQ tile to -1.
The polling interval is also referenced in the Configuring Healthwatch document:
5. Under Tanzu RabbitMQ, select one of the following options:
• Include: The Grafana instance creates dashboards in the Grafana UI for metrics from Tanzu RabbitMQ.
Note:
If you choose to include Tanzu RabbitMQ dashboards, set the Metrics polling interval field in the Tanzu RabbitMQ tile to -1. This prevents the Tanzu RabbitMQ tile from sending duplicate metrics to the Loggregator Firehose. To configure this field, see the Tanzu RabbitMQ documentation.
The detailed setting:
Ops Manager web UI -> Healthwatch tile -> Grafana Dashboards -> Tanzu RabbitMQ -> select ‘Include’
Ops Manager web UI -> RabbitMQ tile -> Metrics -> Metrics polling interval -> set to -1
Alternate methods previously used:
• Add a job filter to the rabbitmq_identity_info selector in the dashboard queries.
Change:
rabbitmq_identity_info{rabbitmq_cluster="$rabbitmq_cluster", namespace="$namespace"}
to:
rabbitmq_identity_info{rabbitmq_cluster="$rabbitmq_cluster", namespace="$namespace", job="rabbitmq"}
• Change the match label in the dashboard queries from (instance) to (ip).
Change:
sum(rabbitmq_connections * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster="$rabbitmq_cluster", namespace="$namespace"})
to:
sum(rabbitmq_connections * on(ip) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster="$rabbitmq_cluster", namespace="$namespace"})
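For comparison, the job-filter variant of the selector can be written out in full for the same connections query. This is a sketch only: it assumes the directly scraped RabbitMQ series carry job="rabbitmq", as in the selector shown above, and it has not been verified against a live Healthwatch installation.

```promql
# Restrict the right-hand side to the directly scraped RabbitMQ series
# so that each instance matches at most one rabbitmq_identity_info series.
sum(
  rabbitmq_connections
  * on(instance) group_left(rabbitmq_cluster)
  rabbitmq_identity_info{rabbitmq_cluster="$rabbitmq_cluster", namespace="$namespace", job="rabbitmq"}
)
```

Both dashboard edits apply to a single panel query at a time, so every affected panel would need the same change, whereas setting the Metrics polling interval to -1 removes the duplicate series at the source.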