loggregator-agent-release introduces a bug in PASW 2.7 that misconfigures the prom_scraper job and prevents it from starting.
This issue has been seen to trigger on a stemcell upgrade, rather than on the upgrade to the PAS / TAS for Windows version that contains the bug. The following points identify the issue:
1. Apply Changes completes successfully, but the PASW VMs are in a failing state.
2. "bosh -d <pas-windows-deployment> is --ps" shows the "prom_scraper" job in a stopped state on all VMs. For example:
    $ bosh -d pas-windows-01235 is --ps
    Using environment '10.XXX.XXX.22' as user 'XXX'

    Task 2978666. Done

    Deployment 'pas-windows-01235'

    Instance                                                 Process                            Process State  AZ              IPs            Deployment
    windows_diego_cell/ae9678c7-a87f-4430-a7cb-eedba8834d5d  -                                  failing        bu-abc-c3-sbx0  10.XXX.XXX.85  pas-windows-01235
    ~                                                        bosh-dns-healthcheck-windows       running        -               -              -
    ~                                                        bosh-dns-nameserverconfig-windows  running        -               -              -
    ~                                                        bosh-dns-windows                   running        -               -              -
    ~                                                        clamd                              running        -               -              -
    ~                                                        dynatrace-oneagent-windows         running        -               -              -
    ~                                                        event_log_forwarder                running        -               -              -
    ~                                                        forwarder-agent                    running        -               -              -
    ~                                                        freshclam                          running        -               -              -
    ~                                                        garden-windows                     running        -               -              -
    ~                                                        loggregator_agent                  running        -               -              -
    ~                                                        prom_scraper                       stopped        -               -              -
    ~                                                        rep_windows                        running        -               -              -
    ~                                                        route_emitter_windows              running        -               -              -
    ~                                                        syslog-agent                       running        -               -              -
3. \var\vcap\sys\log\prom_scraper_windows\prom_scraper\job-service-wrapper.err.log shows the following error:
/var/vcap/jobs/prom_scraper/config/certs/scrape.crt: The system cannot find the path specified.
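The check in point 2 can be scripted when a deployment has many Windows cells. The following is a minimal sketch, not an official tool: `stopped_prom_scraper` is a hypothetical helper name, and it assumes the whitespace-separated column layout shown in the example output above (instance rows carrying `group/uuid` in the first column, process rows starting with `~`).

```shell
#!/bin/sh
# Hypothetical helper (not part of the bosh CLI): reads `bosh is --ps`
# output on stdin and prints every instance whose prom_scraper process
# is not running, followed by the reported process state.
stopped_prom_scraper() {
  awk '
    # Instance rows have "group/uuid" in column 1; remember the latest one.
    $1 ~ /\// { instance = $1; next }
    # Process rows start with "~": column 2 is the process, column 3 the state.
    $1 == "~" && $2 == "prom_scraper" && $3 != "running" {
      print instance, $3
    }
  '
}
```

It would be used as `bosh -d <pas-windows-deployment> is --ps | stopped_prom_scraper`. An empty result means no instance reported a non-running prom_scraper process.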
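To work around the issue, SSH to an affected Windows cell, start PowerShell, replace the incorrect job path in the prom_scraper_windows monit file, and restart the service:
    $ bosh -d pas-windows-01235bac0292cbc2gee94 ssh windows_diego_cell/0
    $ powershell
    $ ((Get-Content -path C:\var\vcap\jobs\prom_scraper_windows\monit -raw) -replace '/var/vcap/jobs/prom_scraper' , '/var/vcap/jobs/prom_scraper_windows') | Set-Content -path C:\var\vcap\jobs\prom_scraper_windows\monit
    $ Restart-Service prom_scraper
Run the replacement only once per VM. The pattern '/var/vcap/jobs/prom_scraper' also matches the start of the corrected path, so a second run would produce '/var/vcap/jobs/prom_scraper_windows_windows'.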