Concourse deployments fail or become unstable when using Ubuntu-Bionic stemcells

Article ID: 297250

Products

Concourse for VMware Tanzu

Issue/Introduction

Impacted Versions

  • Concourse for VMware Tanzu 7.4.4 and earlier


Summary

Deploying Concourse on an ubuntu-bionic stemcell, whether as a new deployment or an upgrade, either fails outright (Scenario 1) or leaves the deployment unstable (Scenario 2).


Scenario 1

When deploying Concourse with a colocated UAA and CredHub, the web VM fails to update and shows this error:

Task 2524 | 16:08:21 | Updating instance web: web/c1b3a6a4-0250-4688-bc4f-54f40482ddef (0) (canary) (00:13:46)
                     L Error: Action Failed get_task: Task c38c9ab9-00f0-4c77-4dbd-c2856a721554 result: 1 of 2 post-start scripts failed. Failed Jobs: uaa. Successful Jobs: credhub.


Note: The CredHub process might also be reported as failed, but the underlying error appears only in the UAA logs.

To confirm, log in to the Operations Manager (Ops Manager) VM via SSH and authenticate to BOSH. For more information on how to do this, refer to the Ops Manager documentation.


After authenticating, run the following command to review the UAA logs:

bosh -d <deployment name> ssh <web instance> -c 'sudo tail -n 500 /var/vcap/sys/log/uaa/uaa.log'

If you see a series of errors similar to the ones below, a probable cause is that the BOSH DNS job was never placed on any of the VMs:

web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | [2022-02-11T16:01:41.089239Z] uaa - 16 [main] ....  WARN --- XmlWebApplicationContext: Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.cloudfoundry.identity.uaa.security.web.SecurityFilterChainPostProcessor#0' defined in ServletContext resource [/WEB-INF/spring-servlet.xml]: Cannot resolve reference to bean 'identityZoneResolvingFilter' while setting bean property 'additionalFilters' with key [TypedStringValue: value [#{T(org.cloudfoundry.identity.uaa.security.web.SecurityFilterChainPostProcessor.FilterPosition).position(5)}], target type [null]]; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'identityZoneResolvingFilter' defined in ServletContext resource [/WEB-INF/spring-servlet.xml]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'identityZoneProvisioning' defined in URL [jar:file:/var/vcap/data/uaa/tomcat/webapps/ROOT/WEB-INF/lib/cloudfoundry-identity-server-75.12.0.jar!/org/cloudfoundry/identity/uaa/zone/JdbcIdentityZoneProvisioning.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flyway' defined in org.cloudfoundry.identity.uaa.db.beans.FlywayConfiguration$FlywayConfigurationWithMigration: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.flywaydb.core.Flyway]: Factory method 'flyway' threw exception; nested exception is org.flywaydb.core.internal.exception.FlywaySqlException:
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Unable to obtain connection from database: The connection attempt failed.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | -------------------------------------------------------------------------
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | SQL State  : 08001
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Error Code : 0
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Message    : The connection attempt failed.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout |
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | [2022-02-11T16:01:41.094146Z] uaa - 16 [main] .... ERROR --- DispatcherServlet: Context initialization failed
.
.
.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Caused by: org.flywaydb.core.internal.exception.FlywaySqlException:
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Unable to obtain connection from database: The connection attempt failed.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | -------------------------------------------------------------------------
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | SQL State  : 08001
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Error Code : 0
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Message    : The connection attempt failed.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout |
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout |  at org.flywaydb.core.internal.jdbc.JdbcUtils.openConnection(JdbcUtils.java:60) ~[flyway-core-5.2.4.jar:?]
.
.
.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
.
.
.
web/fb946539-b604-420f-8af9-f4d5753c0af2: stdout | Caused by: java.net.UnknownHostException: q-s0.db.<network name>.<deployment name>.bosh
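
Because the relevant line is buried at the end of a long stack trace, it can help to filter the log for it. The following is a minimal sketch, assuming the log text is piped in on standard input; `find_unresolved_bosh_hosts` is a hypothetical helper name and not part of the bosh CLI.

```shell
#!/usr/bin/env bash
# Sketch: print each distinct *.bosh hostname that UAA reported as unresolvable.
# Reads log text on stdin. The function name is illustrative only.
find_unresolved_bosh_hosts() {
  # Keep only the "UnknownHostException: <host>.bosh" fragments;
  # "|| true" keeps the pipeline successful when nothing matches.
  { grep -oE 'java\.net\.UnknownHostException: [^[:space:]]+\.bosh' || true; } \
    | sed 's/^java\.net\.UnknownHostException: //' \
    | sort -u
}

# Example usage (placeholders must be replaced with real values):
#   bosh -d DEPLOYMENT ssh WEB_INSTANCE \
#     -c 'sudo tail -n 500 /var/vcap/sys/log/uaa/uaa.log' | find_unresolved_bosh_hosts
```

Any hostname printed by this filter ends in .bosh and could not be resolved from the web VM, which is consistent with BOSH DNS being absent.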

 
To verify that the BOSH DNS job is missing, run bosh -d <deployment name> instances --ps and confirm that bosh-dns is not in the process list:

Deployment 'concourse'

Instance                                     Process     Process State  AZ   IPs           Deployment
db/ca1a0bc4-####-####-####-1ac896b28b81      -           running        az1  10.###.##.#   concourse
~                                            pg_janitor  running        -    -             -
~                                            postgres    running        -    -             -
web/fb946539-####-####-####-f4d5753c0af2     -           failing        az1  10.###.##.##  concourse
~                                            credhub     unknown        -    -             -
~                                            uaa         failing        -    -             -
~                                            web         failing        -    -             -
worker/6bfcb71d-####-####-####-841f9bf5a120  -           running        az1  10.###.##.##  concourse
~                                            worker      running        -    -             -

3 instances
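
The same check can be scripted rather than read by eye. This is a rough sketch, assuming the output of the instances --ps command is piped in on standard input; `has_bosh_dns` is a hypothetical helper name, and the match is a plain text search for a process named bosh-dns.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit 0) if the piped-in process list contains a bosh-dns row,
# fail (exit 1) otherwise. The function name is illustrative only.
has_bosh_dns() {
  # Match "bosh-dns" as a whole whitespace-delimited field.
  grep -E '(^|[[:space:]])bosh-dns([[:space:]]|$)' >/dev/null
}

# Example usage (replace DEPLOYMENT with the real deployment name):
#   if bosh -d DEPLOYMENT instances --ps | has_bosh_dns; then
#     echo "bosh-dns is present"
#   else
#     echo "bosh-dns is missing"
#   fi
```

For the process list shown above, the check would report bosh-dns as missing, matching Scenario 1.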


Scenario 2

Following a successful upgrade or deployment of Concourse using an ubuntu-bionic stemcell, jobs and checks within pipelines begin failing with error messages similar to the following:

runc state: runc: fork/exec /var/gdn/assets/linux/bin/runc: resource temporarily unavailable:



Environment

Product Version: Other

Resolution

To resolve this issue, revert all deployments using the impacted releases of Concourse (Concourse for VMware Tanzu 7.4.4 and earlier) to the ubuntu-xenial stemcell.
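
To find out which stemcell operating system a deployment currently uses, one option is to inspect its manifest. The sketch below assumes the manifest is piped in on standard input (for example from the bosh manifest command) and that the stemcell OS is declared on an `os:` line, as in a typical BOSH manifest; `manifest_stemcell_os` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the distinct values of "os:" keys found in a manifest on stdin.
# Assumes "os:" appears only in the stemcells block; the function name is illustrative.
manifest_stemcell_os() {
  { grep -E '^[[:space:]]*(-[[:space:]]+)?os:[[:space:]]*[^[:space:]]+' || true; } \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?os:[[:space:]]*//; s/[[:space:]]+$//' \
    | sort -u
}

# Example usage (replace DEPLOYMENT with the real deployment name):
#   bosh -d DEPLOYMENT manifest | manifest_stemcell_os
```

A result of ubuntu-bionic for an impacted Concourse release indicates a deployment that should be reverted to ubuntu-xenial.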