Checklist:
To find the source of the issue, go through the following steps.
For example, suppose the "Bosh Unresponsive Agent" chart shows "No data". SSH into the tsdb VM in the Healthwatch deployment and open /var/vcap/jobs/prometheus/config/prometheus.yml. The file contains sections for the "director_direct_scrape" and "director_system_direct_scrape" jobs with the following information.
- job_name: director_direct_scrape
  metrics_path: /metrics
  scheme: https
  tls_config:
    server_name: 10.###.##.###
    ca_file: "/var/vcap/jobs/prometheus/config/certs/director_direct_scrape_ca.pem"
    cert_file: "/var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.pem"
    key_file: "/var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.key"
  static_configs:
  - targets:
    - "10.###.##.###:9091"
- job_name: director_system_direct_scrape
  metrics_path: /metrics
  scheme: https
  tls_config:
    server_name: "system-metrics"
    ca_file: "/var/vcap/jobs/prometheus/config/certs/director_direct_scrape_ca.pem"
    cert_file: "/var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.pem"
    key_file: "/var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.key"
  static_configs:
  - targets:
    - "10.#.#.#:53035"
With that information we can build the following curl commands, one per scrape job.
curl -vk https://10.###.##.###:9091/metrics \
  --cacert /var/vcap/jobs/prometheus/config/certs/director_direct_scrape_ca.pem \
  --cert /var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.pem \
  --key /var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.key

curl -vk https://10.###.##.###:53035/metrics \
  --cacert /var/vcap/jobs/prometheus/config/certs/director_direct_scrape_ca.pem \
  --cert /var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.pem \
  --key /var/vcap/jobs/prometheus/config/certs/director_direct_scrape_certificate.key
The output shows whether any metric is missing, so we can focus on the job that is not emitting it.
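Scanning a long /metrics response by eye is error-prone, so a small check like the one below may help. It is only a sketch: the function name has_metric is hypothetical, and it assumes the response body has already been saved to a file (for example with curl's -o option) and that you know the name of the metric the chart depends on.

```shell
# Hypothetical helper, not part of Healthwatch.
# $1 = file containing a saved /metrics response body
# $2 = metric name to look for
has_metric() {
  # In the Prometheus text format a sample line starts with the metric
  # name, followed by '{' (labels) or a space (no labels). Comment lines
  # such as "# HELP ..." start with '#' and are therefore not matched.
  grep -Eq "^$2(\{| )" "$1"
}
```

Calling has_metric with the saved file and a metric name returns success when at least one sample for that metric is present, so it can be combined with "&&" or "||" to print a message for each metric you expect.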
If the ports are blocked by a firewall, ask the firewall team to open ports 9091 and 53035.
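Before contacting the firewall team, a plain TCP check from the tsdb VM can help tell a blocked port apart from a TLS or certificate problem. The sketch below assumes bash and the coreutils timeout command are available on the VM. The function name port_open is hypothetical.

```shell
# Hypothetical helper, not part of Healthwatch.
# $1 = host or IP address, $2 = TCP port
# Succeeds if a TCP connection can be opened within 5 seconds.
port_open() {
  # bash treats /dev/tcp/HOST/PORT as a request to open a TCP socket;
  # timeout stops the attempt if a firewall silently drops the packets.
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example, using the target address from prometheus.yml:
#   port_open <director-ip> 9091  && echo "9091 reachable"
#   port_open <director-ip> 53035 && echo "53035 reachable"
```

A failure here only shows that the connection could not be opened from this VM. It does not by itself prove that a firewall is the cause.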