Symptoms:
The Minio disk gradually fills up with Spark data checkpoints from InfraClassifier (IC), which runs every hour. When the data on the disk grows too large, it can impact other Intelligence services that also use Minio.
NAPP will also raise an alarm about high disk usage for Data Storage, but nothing about Analytics. The Storage usage can be seen in the alarms and by reviewing the Core Services tab in NAPP.
The problem can be identified by running the disk-usage command (du) on any of the minio-* pods in the nsxi-platform namespace. When the issue is present, the iccheckpoints directory is far larger than the other directories under /data/minio.
1. Enable napp-k commands:
export KUBECONFIG=/config/vmware/napps/.kube/config
2. Open a shell in one of the minio pods (minio-0 is used here as an example). While logged in to the NSX Manager as the root user, issue the following command:
napp-k exec -it minio-0 -- /bin/bash
3. Run the disk-usage command:
du -ah --max-depth=1 /data/minio
20G /data/minio/druid
549M /data/minio/feature-service
22M /data/minio/llanta
79G /data/minio/iccheckpoints <-------------- NOTE: LARGE SIZE 79G!
4.0K /data/minio/events
4.0K /data/minio/icfeatures
514M /data/minio/processing-checkpoints
16K /data/minio/lost+found
12K /data/minio/ntaflow-checkpoints
59M /data/minio/.minio.sys
2.6G /data/minio/data-service
102G /data/minio
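
The check above can also be run without opening an interactive shell in each pod. The following is a minimal sketch, not a supported procedure: top_dirs is a hypothetical helper name, it assumes `sort -h` (GNU coreutils) is available on the NSX Manager where the pipeline runs, and the pod names in the commented usage are assumptions that should be confirmed against the actual pod list first.

```shell
#!/bin/sh
# top_dirs (hypothetical helper): read `du -h` style output (size, then path)
# on stdin, sort it largest first, and keep the top N lines (default 5).
# Assumes `sort -h` (GNU coreutils) exists where this pipeline runs.
top_dirs() {
    sort -rh | head -n "${1:-5}"
}

# Possible usage from the NSX Manager root shell, after exporting KUBECONFIG.
# The pod names below are an assumption; confirm them before running.
#   for pod in minio-0 minio-1 minio-2 minio-3; do
#       echo "== $pod"
#       napp-k exec "$pod" -- du -h --max-depth=1 /data/minio | top_dirs 3
#   done
```

Because the sorting happens on the NSX Manager side of the pipe, the sketch does not depend on which utilities are installed inside the minio container beyond du itself. The /data/minio total will normally appear as the first line, with the largest subdirectory directly below it.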