Troubleshooting:
SSH to one of the TSDB VMs and run the following curl command, replacing __BOSH_DIRECTOR_IP__ with the actual BOSH Director IP:
curl -vk https://__BOSH_DIRECTOR_IP__:53035/metrics \
  --cacert /var/vcap/jobs/prometheus/config/certs/prometheus_ca.pem \
  --cert /var/vcap/jobs/prometheus/config/certs/prometheus_certificate.pem \
  --key /var/vcap/jobs/prometheus/config/certs/prometheus_certificate.key
# TYPE system_cpu_core_idle gauge
system_cpu_core_idle{cpu_name="cpu0",deployment="p-bosh-<GUID>",index="fdcc400c-****-****-****-************",ip="",job="loggr-system-metrics-agent",origin="system_metrics_agent",source_id="system_metrics_agent",unit="Percent"} 99.53208556145128
system_cpu_core_idle{cpu_name="cpu1",deployment="p-bosh-<GUID>",index="fdcc400c-****-****-****-************",ip="",job="loggr-system-metrics-agent",origin="system_metrics_agent",source_id="system_metrics_agent",unit="Percent"} 98.32775919731921
......
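To confirm which deployment identifier your environment reports, you can filter the scraped metrics for the deployment label. A minimal sketch; the sample line below is taken from the output above (GUID masked as in this article), and the grep/sed patterns are generic, not Healthwatch-specific:

```shell
# One metric line from the curl output above (shortened for readability).
line='system_cpu_core_idle{cpu_name="cpu0",deployment="p-bosh-<GUID>",ip=""} 99.53'

# grep -o prints only the matching part of the line;
# sed then strips the label name and surrounding quotes.
echo "$line" | grep -o 'deployment="[^"]*"' | sed 's/deployment="\(.*\)"/\1/'
# Prints: p-bosh-<GUID>
```

If this prints p-bosh-GUID rather than p-bosh, the environment is affected by the issue described below.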
Root Cause:
Prior to Ops Manager 1.8.4, the BOSH product had an ID of p-bosh-GUID. In Ops Manager 1.8.4 and later, new installations use an ID of just p-bosh. Healthwatch 2.x currently expects the BOSH Director deployment name to be p-bosh, but any environment that was originally installed before Ops Manager 1.8.4 and has been upgraded since still retains the p-bosh-GUID identifier. As a result, the query in Grafana returns 'No data' because it matches only p-bosh, not p-bosh-GUID.
This p-bosh-GUID ID issue is planned to be fixed in the next regular Healthwatch release, expected around the beginning of June 2023.
Temporary workaround:
You can clone the dashboard and modify the query to use a regex matcher: system_healthy{deployment=~"p-bosh.*"}.
Note: the cloned dashboard is deleted after each reboot of the Grafana VM, which is why this is only a temporary workaround.
1. Click 'Dashboard settings'.
2. Click 'Save As'.
3. Save the new dashboard.
4. Open the new dashboard.
5. Click 'Edit' on the panel.
6. Click 'Add query'.
7. Enter system_healthy{deployment=~"p-bosh.*"} and click 'Refresh dashboard'.
8. Click 'Apply'.
9. The BOSH Director status now shows Running.
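The regex matcher in step 7 works because deployment=~"p-bosh.*" matches both the new p-bosh name and the legacy p-bosh-GUID name, whereas an exact matcher deployment="p-bosh" matches only the former. A quick shell analogy using grep -E (the deployment names are the ones discussed above; PromQL regex matchers are fully anchored, which the pattern below reproduces explicitly):

```shell
# PromQL anchors =~ regexes on both ends, i.e. "p-bosh.*" behaves as ^p-bosh.*$
pattern='^p-bosh.*$'

echo 'p-bosh'        | grep -Eq "$pattern" && echo 'new name matches'
echo 'p-bosh-<GUID>' | grep -Eq "$pattern" && echo 'legacy name matches'
echo 'other-deploy'  | grep -Eq "$pattern" || echo 'unrelated name does not match'
```

This is why the cloned dashboard shows data in environments that still carry the p-bosh-GUID identifier.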