bosh concourse deployment upgrade fails on pre-start scripts on Postgres

Article ID: 297240

Products

Concourse for VMware Tanzu

Issue/Introduction

While upgrading Concourse, the deployment fails to update the db VM with the following error:
 
Task 27011 | 05:49:41 | Updating instance db: db/ebda3e02-fa91-46a5-83be-924af7525758 (0) (canary) (00:02:33)

           L Error: Action Failed get_task: Task fb8b60b9-d2a7-466c-4fe8-f59964e45df6 result: 1 of 2 pre-start scripts failed. Failed Jobs: postgres. Successful Jobs: bosh-dns.

To get more details on why Postgres fails to start, SSH to the db VM and go to the Postgres log directory:
 
bosh -d <concourse deployment> ssh <db-vm>
cd /var/vcap/sys/log/postgres 

Check the pre-start log, which shows the following errors:
db/ebda3e02-fa91-46a5-83be-924af7525758:/var/vcap/sys/log/postgres# cat pre-start.stdout.log

[2024-03-11T05:49:39.222069978Z] 5932 Checking PostgreSQL is stopped
[2024-03-11T05:49:39.222186088Z] 5932 Pidfile /var/vcap/sys/run/postgres/postgres.pid doesn't exist
[2024-03-11T05:49:39.388185024Z] 5932 kernel.shmmax = 284934144
[2024-03-11T05:49:39.456279993Z] 5932 Running a PostgreSQL major upgrade from postgres-15.5 to postgres-11.15
[2024-03-11T05:49:39.459287881Z] 5932 Initializing data directory /var/vcap/store/postgres/postgres-11.15
[2024-03-11T05:49:39.697875022Z] 5932 The files belonging to this database system will be owned by user "vcap".
[2024-03-11T05:49:39.697952985Z] 5932 This user must also own the server process.
[2024-03-11T05:49:39.698226928Z] 5932 The database cluster will be initialized with locale "en_US.UTF-8".
[2024-03-11T05:49:39.698246955Z] 5932 The default text search configuration will be set to "english".
[2024-03-11T05:49:39.698278903Z] 5932 Data page checksums are disabled.
[2024-03-11T05:49:39.698312044Z] 5932 creating directory /var/vcap/store/postgres/postgres-11.15 ... ok
[2024-03-11T05:49:39.699316978Z] 5932 creating subdirectories ... ok
[2024-03-11T05:49:39.708519935Z] 5932 selecting default max_connections ... 100
[2024-03-11T05:49:39.723670959Z] 5932 selecting default shared_buffers ... 128MB
[2024-03-11T05:49:39.737179040Z] 5932 selecting default timezone ... Etc/UTC
[2024-03-11T05:49:39.737243890Z] 5932 selecting dynamic shared memory implementation ... posix
[2024-03-11T05:49:39.738265037Z] 5932 creating configuration files ... ok
[2024-03-11T05:49:39.884023904Z] 5932 running bootstrap script ... ok
[2024-03-11T05:49:40.382210016Z] 5932 performing post-bootstrap initialization ... ok
[2024-03-11T05:49:40.550873041Z] 5932 syncing data to disk ... ok
[2024-03-11T05:49:40.551069974Z] 5932 Success. You can now start the database server using:
[2024-03-11T05:49:40.551148891Z] 5932   /var/vcap/packages/postgres-11.15/bin/pg_ctl -D /var/vcap/store/postgres/postgres-11.15 -l logfile start
[2024-03-11T05:49:40.601618051Z] 5932 /var/vcap/store/postgres/pg_upgrade_tmp ~
[2024-03-11T05:49:40.606667995Z] 5932 Running in verbose mode
[2024-03-11T05:49:40.609709978Z] 5932 check for "/var/vcap/packages/postgres-15.5/bin" failed: No such file or directory
[2024-03-11T05:49:40.609733104Z] 5932 Failure, exiting
[2024-03-11T06:16:55.529963970Z] 8267 Checking PostgreSQL is stopped
[2024-03-11T06:16:55.530069112Z] 8267 Pidfile /var/vcap/sys/run/postgres/postgres.pid doesn't exist
[2024-03-11T06:16:55.557256937Z] 8267 kernel.shmmax = 284934144
[2024-03-11T06:16:55.558835029Z] 8267 FAIL: DB upgrade stopped in the middle, manual intervention required, quitting...

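From the logs above we can see that Postgres has been successfully migrated from v11.15 to v15.5, but the pre-start script fails when Postgres attempts to start. The last run stops with "DB upgrade stopped in the middle, manual intervention required".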
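
When the log is long, the relevant lines can be filtered out instead of reading the whole file. The following is a minimal sketch, assuming the log path shown above. The function name show_prestart_failures and the match patterns are illustrative; the patterns come from the messages in this particular log, not from a documented list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the failure-related lines of a pre-start log,
# prefixed with their line numbers.
# The default path is the one used in this article. The patterns are
# assumptions based on the messages seen in the log above.
show_prestart_failures() {
  local log="${1:-/var/vcap/sys/log/postgres/pre-start.stdout.log}"
  grep -nE 'FAIL|Failure|failed:' "$log"
}

# Usage on the db VM:
#   show_prestart_failures
#   show_prestart_failures /path/to/another/pre-start.stdout.log
```

For the log in this article, the filter would keep the "check for ... failed", "Failure, exiting" and "FAIL: DB upgrade stopped in the middle" lines.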

Environment

Product Version: 7.9

Resolution

The error is thrown because the lock file POSTGRES_UPGRADE_LOCK still exists.

Since we know that the migration to the new version, v15.5, was successful, we can delete the lock.

1. On the db VM, find and remove the lock file called POSTGRES_UPGRADE_LOCK:
 
find / -name "POSTGRES_UPGRADE_LOCK"

For example, if the lock file is located in /var/vcap/store/postgres/:
 
rm /var/vcap/store/postgres/POSTGRES_UPGRADE_LOCK

2. After deleting POSTGRES_UPGRADE_LOCK, run bosh -d <concourse> deploy concourse.yml again to finish the Concourse upgrade.
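
The two steps above can be sketched as a pair of shell functions. This is an illustration under stated assumptions, not a supported script: the function names remove_upgrade_lock and redeploy_concourse are hypothetical, the search is limited to /var/vcap/store on the assumption that the lock is on the persistent disk, and the deployment name and manifest are placeholders. A real deploy may also need the ops files and variables used for the original deployment.

```shell
#!/usr/bin/env bash
# Step 1 (run as root on the db VM).
# Hypothetical helper: find POSTGRES_UPGRADE_LOCK under a root directory,
# print each match, and remove it.
# Assumption: the lock is a regular file under /var/vcap/store, so the
# search starts there instead of at /.
remove_upgrade_lock() {
  local root="${1:-/var/vcap/store}"
  find "$root" -type f -name 'POSTGRES_UPGRADE_LOCK' -print -exec rm -f -- {} \;
}

# Step 2 (run where the bosh CLI is configured, not on the db VM).
# Hypothetical wrapper: redeploy, then list instances and their processes.
# BOSH defaults to the bosh CLI and can be overridden, for example with
# BOSH=echo to preview the commands without running them.
redeploy_concourse() {
  local deployment="$1" manifest="$2" bosh="${BOSH:-bosh}"
  "$bosh" -d "$deployment" deploy "$manifest" || return 1
  "$bosh" -d "$deployment" instances --ps
}

# Usage:
#   remove_upgrade_lock                          # on the db VM
#   redeploy_concourse concourse concourse.yml   # from the bosh CLI host
```

Printing each path before removal leaves a record of what was deleted. If nothing is printed, no lock file was found under the search root.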