API Developer Portal: PostgreSQL DB container failing to start or connect, Portal fails to start properly


Article ID: 143612

Updated On: 04-08-2024

Products

CA API Developer Portal

Issue/Introduction

API Developer Portal 4.X fails to start. Typical symptoms:
- Many containers fail to connect to the postgres DB container.
- The postgres container receives SIGKILL and exits with status 137.
- The "sed" command fails in the postgres container (for example, "sed: write error").
- The df command shows that the /var volume is completely full.
- /var/lib/docker/volumes/portal_database-postgres-volume/_data/pg_hba.conf has grown far beyond its normal size (it should be only about 4K).
- The /var/lib/docker/volumes/portal_database-postgres-volume/_data directory contains many large pg_hba.confxxx swap files.
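
The file-related symptoms above can be checked with a small helper. The following is a minimal sketch, not part of the Portal tooling: the function name check_pg_hba is hypothetical, and it assumes the leftover files are named pg_hba.conf followed by a suffix, as described above.

```shell
#!/bin/sh
# Hypothetical helper: summarize the state of pg_hba.conf in one data directory.
# Reports file size, total lines, active "local" lines, and leftover
# pg_hba.confxxx files found next to it.
check_pg_hba() {
    dir=$1
    conf="$dir/pg_hba.conf"
    [ -f "$conf" ] || { echo "no pg_hba.conf in $dir" >&2; return 1; }

    bytes=$(wc -c < "$conf")
    lines=$(wc -l < "$conf")
    # Count uncommented lines starting with "local"; prints 0 when none match.
    local_lines=$(grep -c '^local' "$conf")
    # Files named pg_hba.conf plus at least one extra character.
    leftovers=$(find "$dir" -maxdepth 1 -type f -name 'pg_hba.conf?*' | wc -l)

    # $((...)) strips any padding that wc adds on some systems.
    echo "bytes=$((bytes)) lines=$((lines)) local_lines=$local_lines leftovers=$((leftovers))"
}
```

For example, running df -h /var and then check_pg_hba /var/lib/docker/volumes/portal_database-postgres-volume/_data (as root) shows whether the volume is full and whether the file is affected. Based on the expected file ending shown in the Resolution section, a healthy 4.4 file should report two "local" lines and no leftover files.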

NOTE: This can also happen with a Portal 4.4 install even when the disk has enough free space, because the script is hard-coded with a limit. The steps in this article apply in that case as well.

Environment

Release : 4.X

Component : API PORTAL

Cause

The pg_hba.conf file (the target of the failed "sed" command) contains 268435551 lines (2^28 + 95).

Line 15 of entrypoint.sh in the PostgreSQL container runs the following "sed" command:

sed -i 's/local\s*all.*/local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5/' ${PGDATA}/pg_hba.conf

This command replaces each matching line with a string that contains a line break, so one line becomes two. Both resulting lines match the same pattern again, so the number of these lines in pg_hba.conf doubles every time entrypoint.sh is executed.

This growth appears to be what prevents the API Developer Portal from starting.
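
The doubling can be reproduced safely on a throwaway file. The sketch below assumes GNU sed (needed for \s, \t, and \n in the expression) and uses a temporary file with a single hypothetical "local all all trust" entry rather than a real pg_hba.conf; the helper names are made up for illustration.

```shell
#!/bin/sh
# Demo file only; this is NOT the real pg_hba.conf.
demo_file=$(mktemp)
printf 'local   all   all   trust\n' > "$demo_file"

# Same expression as the sed command quoted from entrypoint.sh.
run_entrypoint_sed() {
    sed -i 's/local\s*all.*/local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5/' "$1"
}

# Count active lines that start with "local".
count_local_lines() {
    grep -c '^local' "$1"
}

# Simulate five container starts and print the line count after each one.
i=1
while [ "$i" -le 5 ]; do
    run_entrypoint_sed "$demo_file"
    echo "after run $i: $(count_local_lines "$demo_file") lines"
    i=$((i + 1))
done
```

Each run doubles the count (2, 4, 8, 16, 32). A total of 2^28 + 95 lines would be consistent with a single matching line doubled by 28 executions plus 95 unrelated lines, although the article does not state the restart count.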

Resolution

Every time the API Portal is restarted, the same settings are duplicated in the following files:
/var/lib/docker/volumes/portal_database-postgres-volume/_data/pg_hba.conf
/var/lib/docker/volumes/portal_database-postgres-slave-volume/_data/pg_hba.conf

After many restarts, these files fill the disk and the API Portal fails to start.

The duplicated lines are:
local all postgres peer
local all all md5

As a workaround, regularly remove the excess pairs of these two lines from the pg_hba.conf files to keep the disk from filling up.

Cleanup process:

1. Back up pg_hba.conf in case of typos during cleanup. Give the backup a name that the wildcard in step 2 does not match, such as pg_hba.orig (mv pg_hba.conf pg_hba.orig); otherwise rm pg_hba.conf* would delete the backup as well.
2. Remove the pg_hba.confxxx files, which have a random alphanumeric string appended to the filename:
rm -f pg_hba.conf*
3. Remove the duplicate entries created by the faulty sed statement from pg_hba.conf. The command operates on pg_hba.conf, so if you renamed the file in step 1, copy it back first (for example, sudo cp -p pg_hba.orig pg_hba.conf); pg_hba.orig then remains as the backup.
sudo sed -i '/local\s*all.*/ d' <path to>/pg_hba.conf
4. Append the two required lines to pg_hba.conf:
printf 'local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5\n' | sudo tee -a <path to>/pg_hba.conf
5. Run cat pg_hba.conf to verify that the file ends like this (as of 4.4):
# TYPE DATABASE USER ADDRESS METHOD

# "local" is for Unix domain socket connections only
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local replication admin trust
#host replication admin 127.0.0.1/32 trust
#host replication admin ::1/128 trust

host all all all md5
host replication all 0.0.0.0/0 trust
local all postgres peer
local all all md5

You may need to repeat the above steps for the postgres slave, whose files are in the /var/lib/docker/volumes/portal_database-postgres-slave-volume/_data directory.
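
Steps 2 to 4 of the cleanup can be wrapped in one function so that the same commands run against both the primary and the slave directory. This is a minimal sketch under assumptions: it requires GNU sed and find, the name cleanup_pg_hba is hypothetical, and it does not take a backup, so do step 1 separately. Instead of rm -f pg_hba.conf*, it uses a narrower pattern that leaves pg_hba.conf itself in place.

```shell
#!/bin/sh
# Hypothetical helper: apply cleanup steps 2-4 to one data directory.
# Run as root when pointing it at the Docker volume directories.
cleanup_pg_hba() {
    dir=$1
    conf="$dir/pg_hba.conf"
    [ -f "$conf" ] || { echo "no pg_hba.conf in $dir" >&2; return 1; }

    # Step 2: remove files named pg_hba.conf plus a suffix.
    # The pattern requires at least one extra character, so pg_hba.conf is kept.
    find "$dir" -maxdepth 1 -type f -name 'pg_hba.conf?*' -exec rm -f {} +

    # Step 3: delete every line produced by the faulty sed statement.
    sed -i '/local\s*all.*/ d' "$conf" || return 1

    # Step 4: append the two required lines exactly once.
    printf 'local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5\n' >> "$conf"
}
```

To use it, call cleanup_pg_hba once with /var/lib/docker/volumes/portal_database-postgres-volume/_data and once with /var/lib/docker/volumes/portal_database-postgres-slave-volume/_data, then verify the result with cat as in step 5.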

Additional Information

The PostgreSQL containers are provided for testing purposes. This problem does not occur with an external MySQL database used for production.
This issue is fixed in API Portal 4.5.
