API Developer Portal: Purging data from Druid analytics database

Article ID: 206488

Products

CA API Developer Portal

Issue/Introduction

This article describes how to purge analytics data from the Druid database to reclaim disk space. Accumulated analytics data can consume excessive disk space and cause various problems in the API Portal.

Cause

Disk usage on the /var/lib/docker/overlay partition of the Docker-based Portal host has grown close to 100%. Listing the largest entries in this partition with the following command shows that the Druid database takes up most of the space. This is caused by storing too much data in Analytics.

du -a /var/lib/docker/overlay | sort -n -r | head -n 20

138831144       /var/lib/docker/overlay
129477500       /var/lib/docker/overlay/xxxxxx
64883196        /var/lib/docker/overlay/xxxxxx/merged
64594524        /var/lib/docker/overlay/xxxxxx/merged/var
64594452        /var/lib/docker/overlay/xxxxxx/merged/var/druid
64594288        /var/lib/docker/overlay/xxxxxx/upper
64593772        /var/lib/docker/overlay/xxxxxx/upper/var
64593764        /var/lib/docker/overlay/xxxxxx/upper/var/druid
60825840        /var/lib/docker/overlay/xxxxxx/merged/var/druid/indexing-logs
60825008        /var/lib/docker/overlay/xxxxxx/upper/var/druid/indexing-logs
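
GNU du reports sizes in 1 KiB blocks by default, so the first column above is in kibibytes. As a minimal sketch, assuming GNU du and a hypothetical helper named du_to_gib (not part of the Portal), the listing can be piped through awk to make the sizes easier to read:

```shell
# Hypothetical helper: convert the first column of `du` output
# (assumed to be 1 KiB blocks, the GNU du default) to GiB.
du_to_gib() {
  awk '{
    kib = $1
    path = $0
    sub(/^[0-9]+[ \t]+/, "", path)   # strip the size column, keep the path
    printf "%.1f GiB\t%s\n", kib / 1048576, path
  }'
}

# Usage on the Docker host (commented out; needs read access to the partition):
# du -a /var/lib/docker/overlay | sort -n -r | head -n 20 | du_to_gib
```

Applied to the sample output above, the first line becomes 132.4 GiB and the indexing-logs entries come to about 58.0 GiB each.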

Environment

This affects all API Developer Portal versions with Druid as the Analytics engine.

Resolution

By default, the Portal stores analytics data for 731 days. You can change this retention period by running the following commands. Note that once you change the period, data older than the new number of days is removed.

  1. Find the coordinator container ID by running this command
    • docker ps | grep coordinator
    • Example: $ docker ps | grep coordinator
      5ce48dc8932f        apim-portal.packages.ca.com/apim-portal/druid:4.5                         "/opt/druid-entry.sh"    3 weeks ago         Up 3 weeks (healthy)  
  2. Go into the coordinator container
    • docker exec -it 5ce48dc8932f sh
  3. Run this curl command to find out the current settings
    • curl -X GET 'http://localhost:8081/druid/coordinator/v1/rules/apim_metrics_hour'
    • Example: $ curl -X GET 'http://localhost:8081/druid/coordinator/v1/rules/apim_metrics_hour'
      [{"period":"P731D","includeFuture":false,"tieredReplicants":{"_default_tier":2},"type":"loadByPeriod"},{"type":"dropForever"}]
  4. Run this curl command to set the days to 100
    • curl -X POST 'http://localhost:8081/druid/coordinator/v1/rules/apim_metrics_hour' \
       --header 'Content-Type: application/json' \
       --data '[{
               "period": "P100D",
               "includeFuture": false,
               "tieredReplicants": {
                   "_default_tier": 2
               },
               "type": "loadByPeriod"
           }, {
               "type": "dropForever"
           }
       ]'
  5. Run the curl GET command again and confirm the changes
    • $ curl -X GET 'http://localhost:8081/druid/coordinator/v1/rules/apim_metrics_hour'
      [{"period":"P100D","includeFuture":false,"tieredReplicants":{"_default_tier":2},"type":"loadByPeriod"},{"type":"dropForever"}]
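
Steps 3 to 5 above can also be sketched as two small shell helpers. This is an illustration only: the function names build_retention_rules and retention_days are hypothetical, the payload simply mirrors the JSON shown in step 4, and the sketch assumes that sed is available in the coordinator container and that the functions are defined in the same shell session where curl is run.

```shell
# Hypothetical helpers; the names are illustrative and not part of the Portal.

# Print the retention rule payload (same shape as in step 4) for DAYS days.
build_retention_rules() {
  days=$1
  case "$days" in
    ''|*[!0-9]*) echo "usage: build_retention_rules DAYS" >&2; return 1 ;;
  esac
  if [ "$days" -eq 0 ]; then
    echo "DAYS must be greater than 0" >&2
    return 1
  fi
  printf '[{"period":"P%sD","includeFuture":false,"tieredReplicants":{"_default_tier":2},"type":"loadByPeriod"},{"type":"dropForever"}]\n' "$days"
}

# Read a rules response on stdin and print the number of days in its period.
retention_days() {
  sed -n 's/.*"period":"P\([0-9][0-9]*\)D".*/\1/p'
}

# Inside the coordinator container (commented out; these calls change live settings):
# RULES_URL='http://localhost:8081/druid/coordinator/v1/rules/apim_metrics_hour'
# curl -s -X GET "$RULES_URL" | retention_days        # current value
# build_retention_rules 100 | curl -s -X POST "$RULES_URL" \
#     --header 'Content-Type: application/json' --data @-
# curl -s -X GET "$RULES_URL" | retention_days        # confirm the new value
```

Generating the payload from a single number avoids hand-editing the JSON, and rejecting non-numeric input guards against posting a malformed period.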

This keeps the most recent 100 days of analytics data in the Druid database. You can use any number of days that fits your use case.

***Data older than the new retention period is permanently deleted once the new value is set***
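
Before applying a new value, it can help to see roughly which date the retention window reaches back to. The following is a minimal sketch, assuming GNU date and a hypothetical helper named retention_cutoff. Druid evaluates the period relative to the current time and works on whole segments, so treat the result as an approximation rather than an exact boundary.

```shell
# Hypothetical helper: print the approximate cutoff date (UTC) for a retention
# period of DAYS days, counted back from REFERENCE (default: today's UTC date).
# Assumes GNU date, which accepts relative expressions such as "100 days ago".
retention_cutoff() {
  days=$1
  reference=${2:-$(date -u +%F)}
  date -u -d "$reference $days days ago" +%F
}

# Example with a fixed reference date so the result is reproducible.
retention_cutoff 100 2024-01-01   # prints 2023-09-23
```

Calling retention_cutoff 100 without a second argument counts back from the current date.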

Try these steps on a non-production server first and observe the results before applying them to the live production server.