API Developer Portal 4.X fails to start up

Article ID: 143612

Products

CA API Developer Portal

Issue/Introduction

API Developer Portal 4.X does not start up. The following symptoms are observed:
- Many containers fail to connect to the postgres DB container.
- The postgres container receives SIGKILL and exits with status 137.
- The "sed" command fails in the postgres container.
- The df command shows that the /var volume is completely full.
- /var/lib/docker/volumes/portal_database-postgres-volume/_data/pg_hba.conf has grown considerably in size (it should be only about 4K).
- There are many large pg_hba.confxxx swap files in the /var/lib/docker/volumes/portal_database-postgres-volume/_data directory.
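
The last two symptoms can be checked with a few lines of shell. The following is a minimal sketch, not part of the product: the helper name inspect_pg_hba is hypothetical, GNU grep and find are assumed, and the demo runs against a scratch directory rather than the real Docker volume.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: report how many "local all" rules a pg_hba.conf holds
# and how many pg_hba.conf<suffix> files sit next to it.
inspect_pg_hba() {
    local dir="$1" rules leftovers
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    rules="$(grep -c '^local\s*all' "$dir/pg_hba.conf" || true)"
    # 'pg_hba.conf?*' needs at least one extra character, so it skips pg_hba.conf itself.
    leftovers="$(find "$dir" -maxdepth 1 -type f -name 'pg_hba.conf?*' | wc -l | tr -d ' ')"
    echo "rules=$rules leftovers=$leftovers"
}

# Demo on a scratch directory holding two duplicated pairs and one leftover file.
scratch="$(mktemp -d)"
trap 'rm -rf "$scratch"' EXIT
pair='local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5\n'
printf "$pair" >  "$scratch/pg_hba.conf"
printf "$pair" >> "$scratch/pg_hba.conf"
: > "$scratch/pg_hba.confAb12Cd"

report="$(inspect_pg_hba "$scratch")"
echo "$report"   # prints: rules=4 leftovers=1
```

On an affected host, the same helper could be pointed at /var/lib/docker/volumes/portal_database-postgres-volume/_data (root privileges are likely required), alongside df -h /var to confirm the full volume. A healthy file is expected to report two rules.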

Cause

The pg_hba.conf file (*1) contains 268435551 lines (2^28+95).

*1 The target file of the failed "sed" command

At line 15 of entrypoint.sh in the PostgreSQL container, the "sed" command (*2) is executed.
It replaces every line that matches "local all ..." with two lines separated by a line break, and both of the new lines match the same pattern again.
As a result, the number of "local all" lines in pg_hba.conf doubles every time entrypoint.sh is executed (2^28 is what a single line becomes after 28 doublings).

This "sed" command (*2) appears to be what prevents the API Developer Portal from starting up.

*2 sed -i 's/local\s*all.*/local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5/' ${PGDATA}/pg_hba.conf
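
The doubling can be reproduced safely on a throwaway file. This is only an illustrative sketch: it assumes GNU sed and grep (which may differ from the sed shipped in the container image), the function name simulate_restarts is hypothetical, and the starting line is an assumed example of a single "local all" rule.

```shell
#!/usr/bin/env bash
set -eu

# Scratch file standing in for pg_hba.conf; never run this on the real file.
demo_file="$(mktemp)"
trap 'rm -f "$demo_file"' EXIT
printf 'local   all             all                                     trust\n' > "$demo_file"

# Apply the substitution quoted in (*2) once per simulated restart,
# then print how many "local all" lines the file contains.
simulate_restarts() {
    local file="$1" restarts="$2" i
    for ((i = 0; i < restarts; i++)); do
        sed -i 's/local\s*all.*/local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5/' "$file"
    done
    grep -c '^local\s*all' "$file"
}

matches="$(simulate_restarts "$demo_file" 5)"
echo "matching lines after 5 restarts: $matches"   # prints: matching lines after 5 restarts: 32
```

One line becomes 2, 4, 8, 16 and then 32 lines after five runs, so 28 runs are enough to reach the 2^28 lines reported above.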

Environment

Release : 4.X

Component : API PORTAL

Resolution

The same settings in the following files are duplicated every time the API Portal is restarted.
/var/lib/docker/volumes/portal_database-postgres-volume/_data/pg_hba.conf
/var/lib/docker/volumes/portal_database-postgres-slave-volume/_data/pg_hba.conf

After many restarts, these files grow large enough to cause a "disk full" condition, and the API Portal fails to start up.

The duplicated lines are:
local all postgres peer
local all all md5

As a workaround, regularly remove the excess pairs of these two lines from the pg_hba.conf files to prevent the disk from filling up.

Clean up process:

1. Back up pg_hba.conf in case of typos during cleanup. Give the backup a name such as pg_hba.orig; a backup whose name starts with pg_hba.conf would be deleted by a wildcard such as rm pg_hba.conf* (note that this wildcard also matches pg_hba.conf itself).
2. Remove the pg_hba.confxxx files, that is, the ones with a random alphanumeric string appended to the filename.
3. Remove the duplicate entries created by the bad sed statement from the pg_hba.conf file:
         sudo sed -i '/local\s*all.*/ d' <path to>/pg_hba.conf
4. Append the two required lines to pg_hba.conf (only tee needs sudo, because it is the command that writes to the file):
         echo -e "local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5" | sudo tee -a <path to>/pg_hba.conf
5. Run cat on pg_hba.conf to verify that it ends like this (as of 4.4):
# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local   replication     admin                                trust
#host    replication     admin        127.0.0.1/32            trust
#host    replication     admin        ::1/128                 trust

host all all all md5
host replication all 0.0.0.0/0 trust
local   all             postgres                                peer
local   all             all                                     md5

You may need to repeat the steps above for the postgres slave, whose pg_hba.conf is in the /var/lib/docker/volumes/portal_database-postgres-slave-volume/_data directory.
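
Steps 1 to 4 of the clean up process can be collected into one function so that the same procedure is applied to both volumes. This is a hedged sketch rather than a supported tool: the name cleanup_pg_hba and the optional backup-path argument are hypothetical, GNU sed and find are assumed, and the backup is assumed to go somewhere with enough free space for the oversized file. The demo at the end works on a scratch directory only.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: clean the pg_hba.conf found in directory $1.
# $2 is an optional backup path (default: pg_hba.orig in the same directory).
cleanup_pg_hba() {
    local dir="$1"
    local conf="$dir/pg_hba.conf"
    local backup="${2:-$dir/pg_hba.orig}"

    # Step 1: back up under a name that the pg_hba.conf* wildcard cannot match.
    cp -p "$conf" "$backup"
    # Step 2: remove files named pg_hba.conf plus a suffix, but not pg_hba.conf itself.
    find "$dir" -maxdepth 1 -type f -name 'pg_hba.conf?*' -delete
    # Step 3: delete every "local all" line, duplicated or not.
    sed -i '/local\s*all.*/ d' "$conf"
    # Step 4: append exactly one copy of the two required lines.
    printf 'local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5\n' >> "$conf"
}

# Demo: a scratch directory with one host rule, two duplicated pairs and a leftover file.
scratch="$(mktemp -d)"
trap 'rm -rf "$scratch"' EXIT
pair='local\tall\t\tpostgres\t\t\t\tpeer\nlocal\tall\t\tall\t\t\t\t\tmd5\n'
printf 'host all all all md5\n' > "$scratch/pg_hba.conf"
printf "$pair" >> "$scratch/pg_hba.conf"
printf "$pair" >> "$scratch/pg_hba.conf"
: > "$scratch/pg_hba.confAb12Cd"

cleanup_pg_hba "$scratch"
local_lines="$(grep -c '^local' "$scratch/pg_hba.conf")"
echo "local lines after cleanup: $local_lines"   # prints: local lines after cleanup: 2
```

On a real host the function would need root privileges and would be called once per volume directory, for example with the portal_database-postgres-volume and portal_database-postgres-slave-volume _data paths given above. Step 5, the visual check with cat, is still worth doing afterwards.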

 

Additional Information

The PostgreSQL containers are provided for testing purposes. This problem does not occur when an external MySQL database is used for production.
This issue is fixed in API Portal 4.5.