PCF Metrics Postgres VM datastore full after upgrade from 1.5.* to 1.6.0


Article ID: 293862

Updated On:

Products

Operations Manager

Issue/Introduction

After upgrading PCF Metrics from 1.5.* (confirmed for 1.5.1 and 1.5.2) to 1.6.0 or later, the persistent disk of the Postgres VM might fill up unexpectedly. The upgrade replaces the Postgres version but does not remove the old data directory from the persistent disk, so all previously stored application logs and metrics remain there. The directory for the new Postgres version contains the same data, so the data is duplicated and less free space is left for storing application logs.

Environment

Product Version: Other

Resolution

Before removing anything, determine which directory has not been modified since the day of the upgrade. To do this, gather the following information.

1. Locate the Ops Manager Apply Changes log from the upgrade.
  • Note down the date of the upgrade.
  • Make sure the log contains the following version change (versions may differ).
Example:

releases:
  - name: pcf-metrics
-   version: 1.5.1-build.30
+   version: 1.6.0-build.50
2. Find out which directory has not been used since the date of the upgrade.
  • SSH to the Postgres VM (bosh -e <environment> -d <metrics deployment> ssh <postgres vm>)
  • Change user to root (sudo -i)
  • Navigate to the Postgres directory on the persistent disk (cd /var/vcap/store/postgres)
  • Run ls -lt and confirm which directory has not been modified since the date of the upgrade
Example (in this case the old directory is postgres-9.6.10 and the date of the upgrade was Aug 3rd):

ls -lt
total 20K
drwxr-xr-x 4 root root 4.0K May 1 13:34 ..
-rwx------ 1 vcap vcap  16 Jun 28 20:09 POSTGRES_DATA_VERSION
drwx------ 19 vcap vcap 4.0K Aug 3 05:26 postgres-9.6.10
drwx------ 4 vcap vcap 4.0K Aug 3 05:28 .
drwx------ 19 vcap vcap 4.0K Oct 8 07:51 postgres-9.6.8
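
As a complement to reading the ls -lt output by eye, the check can be sketched with find. This is only an illustration: the function name list_unmodified_since is hypothetical, it assumes GNU find (for -newermt), and the cutoff date must be supplied by you (for example, the day after the upgrade).

```shell
# Hypothetical helper (not part of PCF Metrics or BOSH).
# Lists the immediate subdirectories of $1 whose modification time
# is NOT newer than the cutoff date given in $2 (format: YYYY-MM-DD).
# Assumes GNU find, which provides -newermt.
list_unmodified_since() {
  find "$1" -mindepth 1 -maxdepth 1 -type d ! -newermt "$2" -print
}

# Example call on the Postgres VM as root (replace the date placeholder):
# list_unmodified_since /var/vcap/store/postgres <YYYY-MM-DD>
```

Any directory printed is only a candidate for removal; confirm it against the upgrade date from the Apply Changes log before deleting anything.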
Once you are sure which directory has to be removed:
  • Run df -h to check the used space and take note of it.
  • Remove the directory with rm -R <directory name>
  • Run df -h again to confirm the space has been freed.
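
The check, remove, and re-check sequence can be sketched as a small shell function. This is a minimal sketch, not an official tool: the names used_kb and remove_and_report are hypothetical, and the function deletes whatever directory it is given, so pass it only the directory you have already confirmed as the old one.

```shell
# Hypothetical helpers (illustration only).

# Print the used space, in KB, of the filesystem that holds path $1.
# df -P forces the single-line POSIX output format.
used_kb() {
  df -Pk "$1" | awk 'NR==2 {print $3}'
}

# Remove directory $1 and print how many KB were freed on its filesystem.
remove_and_report() {
  if [ ! -d "$1" ]; then
    echo "not a directory: $1" >&2
    return 1
  fi
  parent=$(dirname "$1")
  before=$(used_kb "$parent")
  rm -R -- "$1"
  after=$(used_kb "$parent")
  echo "freed_kb=$((before - after))"
}

# Example call on the Postgres VM as root:
# remove_and_report /var/vcap/store/postgres/<directory name>
```

The reported figure is approximate, because other processes may write to the same disk between the two measurements.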
Note: If the Postgres database schema was corrupted while the disk was full, PCF Metrics will not report data in the UI.

To work around this after freeing up space:
  • Navigate to the PCF Metrics space with the cf CLI (cf target -o system -s metrics-v1-6)
  • Note down all running apps
  • Stop all running apps (cf stop <app name>)
  • Using the bosh CLI, recreate the Postgres VM (bosh -e <environment> -d <metrics deployment> recreate <postgres vm>)
  • Start all apps that were running
  • The issue should now be completely resolved.
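
The stop, recreate, and start sequence above can be sketched as follows. This is a hedged illustration: the function cf_each and the file running-apps.txt are hypothetical, and the sketch assumes you have written the names of the running apps into that file yourself, one per line, after targeting the PCF Metrics space.

```shell
# Hypothetical helper (illustration only).
# Runs "cf <action> <app>" for every non-empty line in the file $2.
cf_each() {
  action="$1"
  list="$2"
  while IFS= read -r app; do
    if [ -n "$app" ]; then
      # Redirect stdin so cf cannot consume the rest of the list.
      cf "$action" "$app" < /dev/null
    fi
  done < "$list"
}

# Example sequence (placeholders are the same as in the steps above):
# cf target -o system -s metrics-v1-6
# cf_each stop running-apps.txt
# bosh -e <environment> -d <metrics deployment> recreate <postgres vm>
# cf_each start running-apps.txt
```

Keeping the app names in a file means the same list drives both the stop and the start pass, so only the apps that were running beforehand are started again.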