PASW deployment VMs in failing state with prom_scraper job stopped

Article ID: 298036



Products

VMware Tanzu Application Service for VMs

Issue/Introduction

A bug introduced by loggregator-agent-release in PASW 2.7 misconfigures the prom_scraper job, preventing it from starting.

We have seen this issue triggered by a stemcell upgrade rather than by an upgrade to a PAS / TAS for Windows version containing the bug. The following symptoms identify the issue:

1. Apply Changes completes successfully, but the PASW VMs are in a failing state.

2. "bosh -d <pas-windows-deployment> is --ps" shows the "prom_scraper" job in a stopped state on all VMs. For example:

$ bosh -d pas-windows-01235 is --ps
Using environment '10.XXX.XXX.22' as user 'XXX'

Task 2978666. Done

Deployment 'pas-windows-01235'

Instance                                                 Process                            Process State  AZ              IPs           Deployment
windows_diego_cell/ae9678c7-a87f-4430-a7cb-eedba8834d5d  -                                  failing        bu-abc-c3-sbx0  10.XXX.XXX.85  pas-windows-01235
~                                                        bosh-dns-healthcheck-windows       running        -               -             -
~                                                        bosh-dns-nameserverconfig-windows  running        -               -             -
~                                                        bosh-dns-windows                   running        -               -             -
~                                                        clamd                              running        -               -             -
~                                                        dynatrace-oneagent-windows         running        -               -             -
~                                                        event_log_forwarder                running        -               -             -
~                                                        forwarder-agent                    running        -               -             -
~                                                        freshclam                          running        -               -             -
~                                                        garden-windows                     running        -               -             -
~                                                        loggregator_agent                  running        -               -             -
~                                                        prom_scraper                       stopped        -               -             -
~                                                        rep_windows                        running        -               -             -
~                                                        route_emitter_windows              running        -               -             -
~                                                        syslog-agent                       running        -               -             -

 

3. \var\vcap\sys\log\prom_scraper_windows\prom_scraper\job-service-wrapper.err.log shows the following error:

/var/vcap/jobs/prom_scraper/config/certs/scrape.crt: The system cannot find the path specified.
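
On deployments with many cells, the check in step 2 can be scripted. The following is a minimal sketch and not part of the official procedure: the function name stopped_prom_scraper_instances is hypothetical, and the filter assumes the column layout of the example output above (instance names of the form <group>/<id> in the first column, process name in the second, process state in the third). Adjust it if your CLI version formats piped output differently.

```shell
# Hypothetical helper: reads `bosh -d <deployment> is --ps` output on stdin
# and prints each instance whose prom_scraper process is reported as stopped.
stopped_prom_scraper_instances() {
  awk '
    # Remember the most recent instance name (first column contains "/").
    index($1, "/") > 1 { instance = $1 }
    # Print that instance when its prom_scraper row shows the stopped state.
    $2 == "prom_scraper" && $3 == "stopped" { print instance }
  '
}
```

Usage, with the deployment name from the example: bosh -d pas-windows-01235 is --ps | stopped_prom_scraper_instances. Running the same filter again after applying the workaround from the Resolution section should print nothing once no instance reports prom_scraper as stopped.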


Environment

Product Version: 2.7

Resolution

This issue is solved in TAS for Windows 2.7.21.
 
As a temporary workaround, edit "C:\var\vcap\jobs\prom_scraper_windows\monit" and correct the wrong path: change every occurrence of "/var/vcap/jobs/prom_scraper" to "/var/vcap/jobs/prom_scraper_windows", then restart the prom_scraper service. For example:
$ bosh -d pas-windows-01235bac0292cbc2gee94 ssh windows_diego_cell/0
$ powershell
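# The (?!_windows) lookahead skips paths that already end in _windows, so the command is safe to run more than once.
$ ((Get-Content -Path C:\var\vcap\jobs\prom_scraper_windows\monit -Raw) -replace '/var/vcap/jobs/prom_scraper(?!_windows)', '/var/vcap/jobs/prom_scraper_windows') | Set-Content -Path C:\var\vcap\jobs\prom_scraper_windows\monit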
$ Restart-Service prom_scraper