Concourse UI showing "experiencing turbulence" due to over 65K entries in the jobs table [Bug]

Article ID: 297229

Products

Concourse for VMware Tanzu

Issue/Introduction

When the ATC database (DB) has more than 65535 entries in the jobs table, the jobs API (<concourse-web-ui-url>/api/v1/jobs) returns a 500 error. After logging in to the Web UI, the operator sees the "experiencing turbulence" screen.


In the Web Virtual Machine (VM) logs, the following error appears in web.stdout.log:
{"timestamp":"2020-02-24T18:28:51.491805341Z","level":"error","source":"atc","message":"atc.list-all-jobs.failed-to-get-all-visible-jobs","data":{"error":"pq: got 65757 parameters but PostgreSQL only supports 65535 parameters","session":"213"}}
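
To locate this error without reading the whole log, the file can be filtered for the fixed part of the message. The following is a minimal sketch, not an official tool: the function name find_param_limit_errors is hypothetical, and the path to web.stdout.log on the Web VM is left to the operator.

```shell
# Print "<timestamp> <parameter count>" for every parameter-limit error
# found in a Concourse web log.
# find_param_limit_errors is a hypothetical helper name; pass the path to
# web.stdout.log as the only argument.
find_param_limit_errors() {
  log_file="$1"
  # Match on the fixed text of the error, then extract the timestamp and
  # the number of parameters the query tried to pass.
  grep -F 'PostgreSQL only supports 65535 parameters' "$log_file" |
    sed -E 's/.*"timestamp":"([^"]+)".*pq: got ([0-9]+) parameters.*/\1 \2/'
}

# Usage on the Web VM (path is a placeholder):
#   find_param_limit_errors <path to>/web.stdout.log
```

For the log entry above, this prints `2020-02-24T18:28:51.491805341Z 65757`. No output means the log contains no such error.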

PostgreSQL accepts at most 65535 parameters in a single query. When the jobs table holds more than 65535 entries, the query behind the jobs API passes more parameters than this limit allows (65757 in the log above), so the query is rejected.

This bug has been documented in the following GitHub issue:
https://github.com/concourse/concourse/issues/5258

Environment

Product Version: 5.5

Resolution

The "experiencing turbulence" message can also result from other issues, such as:
  1. No network connectivity with the API server
  2. The API server returning a 500 HTTP error response
  3. An upgrade in progress
  4. The load balancer pointing to a VM that no longer exists
Rule out these causes before implementing the resolution below.
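
As a first triage step, the HTTP status returned by the jobs API can help separate a connectivity problem from a server-side error. This is a rough sketch under assumptions: classify_api_status and CONCOURSE_URL are hypothetical names, the status-to-cause mapping is only a hint, and the response may differ depending on whether the request is authenticated.

```shell
# Map an HTTP status code (as reported by curl) to a likely cause.
# classify_api_status is a hypothetical helper; the mapping is a hint only.
classify_api_status() {
  case "$1" in
    000) echo "no response: check network connectivity and the load balancer target" ;;
    500) echo "API server returned 500: check web.stdout.log on the Web VM" ;;
    2??) echo "API responded normally" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage (CONCOURSE_URL is a placeholder for <concourse-web-ui-url>).
# curl reports 000 as the status when no HTTP response was received.
#   status=$(curl -s -o /dev/null -w '%{http_code}' "$CONCOURSE_URL/api/v1/jobs")
#   classify_api_status "$status"
```

A 500 status alone does not confirm this issue; the log entry and the row count of the jobs table are the deciding evidence.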

To confirm that the jobs table has exceeded the PostgreSQL limit:
  1. SSH into the DB VM in the Concourse deployment:
    1. $ bosh -d <concourse deployment> ssh <db instance name>
  2. Log in to the ATC database:
    1. $ /var/vcap/packages/<postgres package>/bin/psql -p 5432 -U vcap -d atc
  3. Use a select statement to count the entries in the jobs table:
    1. atc=# select count(*) from jobs;
  4. If this issue applies, the query reports a value over 65535.
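
The comparison in the last step can also be scripted. The sketch below assumes the count has already been read from the database; jobs_count_exceeds_limit and PG_MAX_PARAMS are hypothetical names, and the psql invocation in the comment reuses the placeholders from the steps above.

```shell
# Maximum number of parameters PostgreSQL accepts in one query.
PG_MAX_PARAMS=65535

# Report whether a jobs-table row count is over the parameter limit.
# Returns 0 when the count exceeds the limit, 1 otherwise.
# jobs_count_exceeds_limit is a hypothetical helper name.
jobs_count_exceeds_limit() {
  count="$1"
  if [ "$count" -gt "$PG_MAX_PARAMS" ]; then
    echo "jobs table has $count rows: over the $PG_MAX_PARAMS parameter limit"
    return 0
  fi
  echo "jobs table has $count rows: within the $PG_MAX_PARAMS parameter limit"
  return 1
}

# Usage on the DB VM (-t -A make psql print only the bare number):
#   count=$(/var/vcap/packages/<postgres package>/bin/psql -p 5432 -U vcap -d atc -t -A -c 'select count(*) from jobs;')
#   jobs_count_exceeds_limit "$count"
```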


Permanent Fix

This issue will be permanently fixed in Concourse 6.0 or above.

Note: Concourse 6.0 is not currently in General Availability. Until it is, the operator will have to follow the workaround described below.


Workaround

If the operator cannot upgrade to Concourse 6.0, the number of entries in the jobs table can be reduced by deleting unused pipelines with the following command:

$ fly -t <target> destroy-pipeline -p <pipeline name>
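
When many pipelines need to be removed, the command above can be wrapped in a loop. The following is a cautious sketch rather than a supported procedure: destroy_pipelines is a hypothetical helper, the list file of pipeline names is assumed to be prepared and reviewed by the operator, and the function only prints the commands unless "execute" is passed explicitly. The -n flag tells fly to skip the interactive confirmation.

```shell
# Print (default) or run one destroy-pipeline command per pipeline name
# listed in a file. Empty lines and lines starting with '#' are skipped.
# destroy_pipelines is a hypothetical helper name.
#   $1: fly target   $2: file with one pipeline name per line
#   $3: "execute" to run the commands; anything else is a dry run
destroy_pipelines() {
  target="$1"
  list_file="$2"
  mode="${3:-dry-run}"
  while IFS= read -r name || [ -n "$name" ]; do
    case "$name" in
      ''|'#'*) continue ;;
    esac
    if [ "$mode" = "execute" ]; then
      # -n skips the confirmation prompt; stdin is redirected so fly
      # cannot consume the remaining lines of the list file.
      fly -t "$target" destroy-pipeline -p "$name" -n < /dev/null
    else
      echo "fly -t $target destroy-pipeline -p $name -n"
    fi
  done < "$list_file"
}

# Usage:
#   destroy_pipelines <target> unused-pipelines.txt            # dry run
#   destroy_pipelines <target> unused-pipelines.txt execute    # destroys pipelines
```

Destroying a pipeline is irreversible, so review the dry-run output before running with "execute".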