Concourse Web Process fails to start during deployment when Prometheus is enabled
Article ID: 297248

Products

Concourse for VMware Tanzu

Issue/Introduction

The following versions of Concourse BOSH Release are affected by this issue:
  • Concourse BOSH Release 7.4.0
  • Concourse BOSH Release 7.4.4

When deploying Concourse using the Concourse BOSH Release, the web process fails to start with the following error in /var/vcap/sys/log/web/web.stderr.log:
panic: descriptor Desc{fqName: "concourse_error_logs", help: "Number of error logged", constLabels: {}, variableLabels: [message]} is invalid: "bosh-deployment" is not a valid label name for metric "concourse_error_logs"

goroutine 1 [running]:
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0xc00009b0e0, 0xc00084b710, 0x1, 0x1)
	github.com/prometheus/[email protected]/prometheus/registry.go:403 +0xb7
github.com/prometheus/client_golang/prometheus.MustRegister(...)
	github.com/prometheus/[email protected]/prometheus/registry.go:178
github.com/concourse/concourse/atc/metric/emitter.(*PrometheusConfig).NewEmitter(0xc000500c80, 0xc00084e120, 0x0, 0x0, 0x1, 0xc00084b6b0)
	github.com/concourse/concourse/atc/metric/emitter/prometheus.go:133 +0x184
github.com/concourse/concourse/atc/metric.(*Monitor).Initialize(0xc0000b0280, 0x36cca90, 0xc00064fa40, 0xc00085a300, 0x24, 0xc00084e120, 0x3e8, 0xc00064fa40, 0xc0008660d0)
	github.com/concourse/concourse/atc/metric/emit.go:143 +0x4c5
github.com/concourse/concourse/atc/atccmd.(*RunCommand).configureMetrics(0xc00004e580, 0x36cca90, 0xc00064f4a0, 0x36614c0, 0x335b5d8)
	github.com/concourse/concourse/atc/atccmd/command.go:1599 +0xd5
github.com/concourse/concourse/atc/atccmd.(*RunCommand).Runner(0xc00004e580, 0xc00084ac30, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0)
	github.com/concourse/concourse/atc/atccmd/command.go:571 +0x545
main.(*WebCommand).Runner(0xc0003ca008, 0xc00084ac30, 0x0, 0x1, 0x2, 0x6, 0xc00084aaa0, 0xc0000a6980)
	github.com/concourse/concourse/cmd/concourse/web.go:62 +0xa5
main.(*WebCommand).Execute(0xc0003ca008, 0xc00084ac30, 0x0, 0x1, 0x2778480, 0x2af9320)
	github.com/concourse/concourse/cmd/concourse/web.go:44 +0x65
github.com/vito/twentythousandtonnesofcrudeoil.installEnv.func2(0x7f5129558518, 0xc0003ca008, 0xc00084ac30, 0x0, 0x1, 0x1, 0x182)
	github.com/vito/twentythousandtonnesofcrudeoil@v0.0.0-20180305154709-3b21ad808fcb/environment.go:40 +0x8a
github.com/jessevdk/go-flags.(*Parser).ParseArgs(0xc00066b7a0, 0xc000142010, 0x1, 0x1, 0xc00075b7c0, 0x0, 0x0, 0x0, 0x0)
	github.com/jessevdk/[email protected]/parser.go:340 +0x85d
github.com/jessevdk/go-flags.(*Parser).Parse(...)
	github.com/jessevdk/[email protected]/parser.go:190
main.main()
	github.com/concourse/concourse/cmd/concourse/main.go:30 +0x1f5


Impact

Deploying an impacted version with the Prometheus client enabled fails because the web process cannot start. This applies to both new deployments and upgrades of existing deployments.


Cause

Newer versions of the Prometheus Go client library enforce strict checks on metric label names, and the label naming convention treats dashes as invalid characters. The Concourse BOSH deployment associated with the impacted versions defines metric labels with dashes in their names, such as the "bosh-deployment" label reported in the panic message above, so metric registration panics when the web process starts.
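
The naming rule can be illustrated with a small shell check. This is only a sketch: it assumes the traditional label name pattern documented in the Prometheus data model ([a-zA-Z_][a-zA-Z0-9_]*), and the function name is_valid_label_name is a hypothetical helper, not part of Concourse or the Prometheus client.

```shell
# Hypothetical helper: succeeds when the name matches the traditional
# Prometheus label name pattern (an assumption taken from the Prometheus
# data model documentation, not from the Concourse source).
is_valid_label_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-zA-Z_][a-zA-Z0-9_]*$'
}

# Compare the label from the panic message with an underscore variant.
for name in bosh-deployment bosh_deployment; do
  if is_valid_label_name "$name"; then
    echo "$name: valid"
  else
    echo "$name: invalid"
  fi
done
```

Under that assumption, the dashed name from the panic message is reported as invalid, while the underscore variant passes.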

Environment

Product Version: Other

Resolution

If an upgrade to one of the impacted versions of Concourse is strictly necessary, the Prometheus client must be disabled before proceeding. How the client is disabled depends on whether Concourse was deployed using operations files or a manifest.


Method 1 - Concourse Deployed Using Operations Files

A typical Concourse BOSH deployment uses the operations files that are packaged in the download. In this case, the deployment script or command line contains a line similar to the following:
-o ./cluster/operations/prometheus.yml \
Locate and remove this line, then rerun the shell script or deploy command.
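
As a rough sketch of this step, the commands below locate the operations file reference in a deploy script and write a copy without it. The script name deploy.sh, its contents, and the other file paths in it are assumptions made up for illustration; a real deploy script will differ.

```shell
# Hypothetical deploy script, created here only so the example is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/deploy.sh" <<'EOF'
bosh -d concourse deploy ./cluster/concourse.yml \
  -o ./cluster/operations/basic-auth.yml \
  -o ./cluster/operations/prometheus.yml \
  -l ./versions.yml
EOF

# Show where the Prometheus operations file is referenced.
grep -n -F 'operations/prometheus.yml' "$workdir/deploy.sh"

# Write a copy without that line; review it before replacing the original.
grep -v -F 'operations/prometheus.yml' "$workdir/deploy.sh" > "$workdir/deploy.no-prometheus.sh"
cat "$workdir/deploy.no-prometheus.sh"
```

If the Prometheus line happens to be the last argument in the command, the line before it will be left with a trailing backslash that also needs to be removed.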


Method 2 - Concourse Deployed via Manifest

Please read before proceeding: It is best practice to make sure that a deployment manifest file is backed up or committed to a versioning system to facilitate quick recovery in the event of an error during modification.

If Concourse is being deployed using a BOSH manifest, then the Prometheus property must be removed from the web job definition inside the manifest file. Look for lines similar to the following within the web job:
prometheus:
  bind_ip: 0.0.0.0
  bind_port: 9090

The values may vary depending on the deployment. Once located, remove these lines and rerun the deployment with the updated manifest.
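
The backup and lookup steps above might look like the following sketch. The manifest file name concourse.yml and the surrounding manifest structure are assumptions for illustration only; the property block itself is the one shown above.

```shell
# Hypothetical manifest, created here only so the example is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/concourse.yml" <<'EOF'
instance_groups:
- name: web
  jobs:
  - name: web
    release: concourse
    properties:
      prometheus:
        bind_ip: 0.0.0.0
        bind_port: 9090
EOF

# Keep a backup copy before editing the manifest.
cp "$workdir/concourse.yml" "$workdir/concourse.yml.bak"

# Print the prometheus property and the two lines that follow it,
# with line numbers, so the block to remove is easy to find in an editor.
grep -n -A 2 'prometheus:' "$workdir/concourse.yml"
```

The removal itself is left as a manual edit here, since deleting YAML blocks with line-based tools is easy to get wrong when the indentation or the number of nested keys differs.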