The Usage Service can occasionally run into data reliability problems and data corruption, which can make it necessary to Purge & Reseed the usage database. This KB explains how to monitor usage-service-scheduler metrics to help prevent corruption of usage data.
The usage-service-scheduler application running on the system exposes a worker check-in (worker_check_in) that reports worker health. The endpoint is GET /heartbeat/workers
The service also calculates a "Doomsday" metric, which predicts how many days remain before a "Purge and Reseed" becomes mandatory due to data expiration. The endpoint is GET /heartbeat/doomsday
Examples:
Worker health:
% curl -k https://app-usage.<SYSTEM DOMAIN>/heartbeat/workers
ok
Doomsday counter:
% curl -k https://app-usage.<SYSTEM DOMAIN>/heartbeat/doomsday
{"days_left":31,"source_details":{"failed_ingestion_app_days_left":31,"failed_ingestion_service_days_left":31}}
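The Doomsday response can also be checked by a script instead of by hand. The following is a minimal Python sketch, not part of the product. The function names (fetch_doomsday, classify_doomsday) and the 14-day and 7-day thresholds are assumptions chosen for illustration; pick thresholds that match your own operational policy. It assumes the response has the shape shown in the example above, with an integer days_left field.

```python
import json
import ssl
import urllib.request

# Hypothetical alert thresholds (in days); these are NOT defined by the
# usage service and should be adjusted to your own policy.
WARN_DAYS = 14
CRIT_DAYS = 7


def fetch_doomsday(system_domain: str, timeout: float = 10.0) -> str:
    """Fetch the raw Doomsday response body.

    Certificate verification is disabled to mirror `curl -k` in the
    examples; enable verification if your certificates are trusted.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    url = f"https://app-usage.{system_domain}/heartbeat/doomsday"
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def classify_doomsday(body: str, warn_days: int = WARN_DAYS,
                      crit_days: int = CRIT_DAYS):
    """Parse a Doomsday response and map days_left to an alert level."""
    payload = json.loads(body)
    days_left = payload["days_left"]
    details = payload.get("source_details", {})
    if days_left <= crit_days:
        level = "critical"
    elif days_left <= warn_days:
        level = "warning"
    else:
        level = "ok"
    return level, days_left, details


# Classify the sample response from the example above (no network call).
sample = ('{"days_left":31,"source_details":'
          '{"failed_ingestion_app_days_left":31,'
          '"failed_ingestion_service_days_left":31}}')
level, days_left, details = classify_doomsday(sample)
print(level, days_left)  # prints: ok 31
```

To run the check against a live system, pass the system domain to fetch_doomsday and feed the result to classify_doomsday, for example from a scheduled job that raises an alert when the level is not ok.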