AIOps - jarvis-esutils purge job not deleting Elastic indices as per data retention setting
Article ID: 248994

Updated On:

Products

DX Operational Intelligence, DX Application Performance Management, CA App Experience Analytics

Issue/Introduction

You have noticed that the Elasticsearch purge job is not deleting indices as expected based on the data retention settings.

Here are some of the steps you have taken to validate this condition:

1) Checked the data retention setting (the DEFAULT_RETENTION_PERIOD variable) in the jarvis-esutils deployment:

kubectl get deployment jarvis-esutils -n <namespace> -o yaml
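
The YAML output can be long, so a small filter helps to pull out only the retention value. The following is a minimal sketch, not part of the product: the function name extract_retention is hypothetical, and it assumes the variable is defined directly in the container's env list as a name/value pair. If the value comes from a ConfigMap or Secret (valueFrom), the filter prints nothing.

```shell
# Hypothetical helper: reads deployment YAML on stdin and prints the value
# that follows "name: DEFAULT_RETENTION_PERIOD" in the env list.
# Assumes the "value:" line immediately follows the "name:" line.
extract_retention() {
  grep -A1 'name: DEFAULT_RETENTION_PERIOD' \
    | sed -n 's/.*value: *"\{0,1\}\([^"]*\)"\{0,1\} *$/\1/p'
}

# Usage (replace <namespace> with your namespace):
#   kubectl get deployment jarvis-esutils -n <namespace> -o yaml | extract_retention
```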


2) Identified which indices contain old data:

a) Listed all the indices, sorted by size in descending order: http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc

b) Checked the newest and the oldest entry in each suspect index:

http(s)://{es_endpoint}/<index-name>/_search?pretty&sort=@timestamp:desc
http(s)://{es_endpoint}/<index-name>/_search?pretty&sort=@timestamp:asc

For example:
http(s)://{es_endpoint}/ao_axa_session_events_1_284/_search?pretty&sort=@timestamp:desc
http(s)://{es_endpoint}/ao_axa_session_events_1_284/_search?pretty&sort=@timestamp:asc

NOTE: you have used https://www.epochconverter.com/ to convert the @timestamp values to human-readable timestamps
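
The same conversion can be done on the command line, which also makes it easy to compare the age of the oldest entry with the retention period. This is a rough sketch under two assumptions: the @timestamp values are epoch milliseconds, and GNU date is available. The function names are hypothetical.

```shell
# Convert an epoch value in milliseconds to a UTC timestamp (GNU date).
epoch_ms_to_utc() {
  date -u -d "@$(( $1 / 1000 ))" +"%Y-%m-%dT%H:%M:%SZ"
}

# Print the age, in whole days, of an epoch-millisecond timestamp.
# Optional second argument: "now" in epoch ms (defaults to the current time).
age_in_days() {
  local ts_ms=$1
  local now_ms=${2:-$(( $(date +%s) * 1000 ))}
  echo $(( (now_ms - ts_ms) / 86400000 ))
}

# Usage:
#   epoch_ms_to_utc <timestamp-value>
#   age_in_days <timestamp-value>
```

If age_in_days for the oldest entry is well above the configured retention period, the index has not been purged as expected.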

What is the root cause and how can we fix it?

Environment

Release : 21.3

Component : CA DOI AO PLATFORM COMPONENTS

Cause

The ESUtils (Jarvis Elasticsearch Utilities) purge job failed because some indices had grown too large. Several were over 1 TB, whereas under normal conditions rollover should happen every 30 GB per shard.
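
To check whether any index has grown past the expected rollover size, the _cat/indices output can be filtered by average primary shard size. The sketch below is illustrative only: the function name find_oversized is hypothetical, and it expects the _cat/indices columns index, pri and pri.store.size with bytes=gb so that sizes are plain numbers in GB.

```shell
# Hypothetical helper: reads rows of "index pri pri.store.size(GB)" on stdin
# and prints the indices whose average primary shard size exceeds the limit.
# First argument: limit in GB (defaults to 30, the rollover size noted above).
find_oversized() {
  awk -v limit="${1:-30}" \
    'NF >= 3 && $2 > 0 && ($3 / $2) > limit { print $1, $3 "gb" }'
}

# Usage (replace {es_endpoint} with your Elasticsearch endpoint):
#   curl -s "http(s)://{es_endpoint}/_cat/indices?h=index,pri,pri.store.size&bytes=gb" | find_oversized 30
```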
 
 

Resolution

1) Identify the problematic indices by querying Elasticsearch and sorting the result by size:

http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc

NOTE: replace {es_endpoint} with your Elasticsearch endpoint

2) Delete the unwanted indices:
 
a) Log in to the OpenShift or Kubernetes master server
 
b) Run: curl -X DELETE http(s)://{es_endpoint}/<index_name>
 
For example:

curl -X DELETE http(s)://{es_endpoint}/ao_axa_session_events_1_284
curl -X DELETE http(s)://{es_endpoint}/ao_axa_session_events_1_244

If the endpoint uses https, you might need to add --insecure. The option must come before -X, because -X takes the request method (DELETE) as its argument:

curl --insecure -X DELETE https://{es_endpoint}/ao_axa_session_events_1_284
curl --insecure -X DELETE https://{es_endpoint}/ao_axa_session_events_1_244
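
When several indices have to be removed, a small loop with a dry-run mode reduces the risk of deleting the wrong index. This is a sketch only: delete_indices, ES_URL and DRY_RUN are hypothetical names, not product settings. DRY_RUN defaults to 1, so nothing is deleted until it is explicitly set to 0.

```shell
# Hypothetical helper: deletes each index passed as an argument.
# ES_URL  - base URL of the Elasticsearch endpoint (assumption, set by you)
# DRY_RUN - 1 (default) only prints the commands; 0 sends the DELETE requests
# Add --insecure to the curl call below if the https endpoint requires it.
delete_indices() {
  local idx
  for idx in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "curl -X DELETE ${ES_URL}/${idx}"
    else
      curl -s -X DELETE "${ES_URL}/${idx}"
      echo
    fi
  done
}

# Usage:
#   ES_URL="http(s)://{es_endpoint}"
#   delete_indices ao_axa_session_events_1_284 ao_axa_session_events_1_244
#   DRY_RUN=0; delete_indices ao_axa_session_events_1_284 ao_axa_session_events_1_244
```

Review the dry-run output before setting DRY_RUN=0, because a deleted index cannot be recovered without a snapshot.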

 

3) Verify that the indices have been deleted:

http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc

NOTE: replace {es_endpoint} with your Elasticsearch endpoint
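
As an alternative to scanning the full list, each index can be checked individually: a HEAD request on the index returns HTTP 404 when the index no longer exists and 200 when it is still present. A minimal sketch, where check_index, describe_status and ES_URL are hypothetical names:

```shell
# Translate the HTTP status of a HEAD request on an index into a short message.
describe_status() {
  case "$1" in
    404) echo "deleted" ;;
    200) echo "still present" ;;
    *)   echo "unexpected HTTP status: $1" ;;
  esac
}

# Send a HEAD request for one index and report the result.
# ES_URL is the base URL of the Elasticsearch endpoint (assumption, set by you).
# Add --insecure to the curl call if the https endpoint requires it.
check_index() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' -I "${ES_URL}/$1")
  echo "$1: $(describe_status "$code")"
}

# Usage:
#   ES_URL="http(s)://{es_endpoint}"
#   check_index ao_axa_session_events_1_284
```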

4) Verify that the ESUtils (Jarvis Elasticsearch Utilities) purge job is working as expected:

- Go to <nfs-dxi-folder>/jarvis/esutils/logs/<jarvis-esutils-current-pod-name>

- Check the purge log for deletion messages:

grep "has been deleted" jarvis-es-utils-Purge.log


Additional Information

https://knowledge.broadcom.com/external/article/190815/aiops-troubleshooting-common-issues-and.html