You have noticed that the purge job for Elasticsearch is not deleting indices as expected based on the data retention settings.
Here are some of the steps you have taken to validate this condition:
1) Checked the data retention setting (the DEFAULT_RETENTION_PERIOD variable) in the jarvis-esutils deployment:
kubectl get deployment jarvis-esutils -n <namespace> -o yaml
2) Found which indices contain old data:
a) Listed all the indices: http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc
b) Checked the first and last entry:
http(s)://{es_endpoint}/<index-name>/_search?pretty&sort=@timestamp:desc
http(s)://{es_endpoint}/<index-name>/_search?pretty&sort=@timestamp:asc
for example:
http(s)://{es_endpoint}/ao_axa_session_events_1_284/_search?pretty&sort=@timestamp:desc
http(s)://{es_endpoint}/ao_axa_session_events_1_284/_search?pretty&sort=@timestamp:asc
NOTE: you have used https://www.epochconverter.com/ to convert the @timestamp values to human-readable timestamps
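The validation steps above can also be done from a shell. The following is a minimal sketch, not part of the original procedure: the helper names retention_from_yaml and epoch_ms_to_utc are hypothetical, GNU date is assumed, and the @timestamp values are assumed to be epoch milliseconds (remove the division by 1000 if yours are in seconds).

```shell
# Hypothetical helpers; the names are illustrative only.

# Print the DEFAULT_RETENTION_PERIOD entry from deployment YAML read on stdin.
# Assumes the value is on the line right after the name line.
retention_from_yaml() {
  grep -A1 'name: DEFAULT_RETENTION_PERIOD'
}

# Convert an epoch-milliseconds value to a UTC timestamp (GNU date assumed).
epoch_ms_to_utc() {
  date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S UTC'
}

# Usage (replace <namespace> with your namespace):
#   kubectl get deployment jarvis-esutils -n <namespace> -o yaml | retention_from_yaml
#   epoch_ms_to_utc 1600000000000
```

Comparing the converted timestamp of the oldest document in an index with the retention period shows whether that index should already have been purged.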
What is the root cause and how can we fix it?
Release : 21.3
Component : CA DOI AO PLATFORM COMPONENTS
1) Identify the problematic indices by querying Elasticsearch, sorting the result by size:
http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc
NOTE: replace {es_endpoint} with your elastic_endpoint
2) Delete the problematic indices, for example:
curl -X DELETE http(s)://{es_endpoint}/ao_axa_session_events_1_284
curl -X DELETE http(s)://{es_endpoint}/ao_axa_session_events_1_244
If the endpoint uses https, you might need to add --insecure (it must not be placed between -X and DELETE):
curl --insecure -X DELETE https://{es_endpoint}/ao_axa_session_events_1_284
curl --insecure -X DELETE https://{es_endpoint}/ao_axa_session_events_1_244
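When several indices have to be removed, the deletion can be wrapped in a small loop. This is only a sketch under stated assumptions: delete_indices is a hypothetical helper, ES_ENDPOINT and DRY_RUN are assumed variable names, and --insecure is included on the assumption that the endpoint uses https with an untrusted certificate. By default the helper only prints the commands so they can be reviewed first.

```shell
# Hypothetical helper: delete each index named on the command line.
# ES_ENDPOINT is an assumed variable holding the full base URL.
# With DRY_RUN=1 (the default here) the curl commands are printed, not run.
# Drop --insecure if the endpoint is plain http or its certificate is trusted.
delete_indices() {
  for index in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "curl --insecure -X DELETE ${ES_ENDPOINT}/${index}"
    else
      curl --insecure -X DELETE "${ES_ENDPOINT}/${index}"
    fi
  done
}

# Usage:
#   ES_ENDPOINT=https://<es_endpoint>
#   delete_indices ao_axa_session_events_1_284 ao_axa_session_events_1_244
#   DRY_RUN=0 delete_indices ao_axa_session_events_1_284 ao_axa_session_events_1_244
```

Index deletion cannot be undone, which is why the sketch defaults to a dry run.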
3) Verify that the indices have been deleted:
http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc
NOTE: replace {es_endpoint} with your elastic_endpoint
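As an alternative to reading the full index list, a single index can be checked with a HEAD request, which Elasticsearch answers with 200 if the index exists and 404 if it does not. The sketch below is illustrative only: index_verdict is a hypothetical helper and ES_ENDPOINT is an assumed variable holding the base URL.

```shell
# Hypothetical helper: map the HTTP status of a HEAD request on an index
# to a short verdict (200 = index exists, 404 = index not found).
index_verdict() {
  case "$1" in
    200) echo "still present" ;;
    404) echo "deleted" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage:
#   code=$(curl --insecure -s -o /dev/null -w '%{http_code}' -I "${ES_ENDPOINT}/ao_axa_session_events_1_284")
#   index_verdict "$code"
```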
4) Verify that "ESUtils Jarvis Elasticsearch Utilities" purge job is working as expected
- Go to <nfs-dxi-folder>/jarvis/esutils/logs/<jarvis-esutils-current-pod-name>
- Check the purge log:
grep "has been deleted" jarvis-es-utils-Purge.log
Example output: