After rebooting a system running API Developer Portal 4.4, the customer found that API usage metrics are no longer shown in the API Portal analytics from the time of the reboot onward.
The Docker journal log shows:
portal_coordinator.heulwy6io17vwij501ebwo5id.i72mux5oezrfmrv4gwpdfwp55 2020-09-03 03:45:07 UTC 2020-09-03T03:45:07,629 ERROR [KafkaSupervisor-apim_metrics_hour] org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor - SeekableStreamSupervisor[apim_metrics_hour] failed to handle notice: {class=org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor, exceptionType=class org.apache.druid.java.util.common.ISE, exceptionMessage=Previous sequenceNumber [343522] is no longer available for partition [0]. You can clear the previous sequenceNumber and start reading from a valid message by using the supervisor's reset API., noticeClass=RunNotice}
Release : 4.4
Component : API PORTAL
This error can occur after a reboot in which the Druid containers were not stopped in the correct sequence.
To resolve this error, first check the portal_coordinator container log for one of the following errors:
portal_coordinator.heulwy6io17vwij501ebwo5id.i72mux5oezrfmrv4gwpdfwp55 2020-09-03 03:45:07 UTC 2020-09-03T03:45:07,629 ERROR [KafkaSupervisor-apim_metrics_hour] org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor - SeekableStreamSupervisor[apim_metrics_hour] failed to handle notice: {class=org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor, exceptionType=class org.apache.druid.java.util.common.ISE, exceptionMessage=Previous sequenceNumber [343522] is no longer available for partition [0]. You can clear the previous sequenceNumber and start reading from a valid message by using the supervisor's reset API., noticeClass=RunNotice}
or
portal_coordinator.heulwy6io17vwij501ebwo5id.i72mux5oezrfmrv4gwpdfwp55 2020-09-03 03:45:07 UTC 2020-09-03T03:45:07,629 ERROR [KafkaSupervisor-apim_metrics] org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor - SeekableStreamSupervisor[apim_metrics_hour] failed to handle notice: {class=org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor, exceptionType=class org.apache.druid.java.util.common.ISE, exceptionMessage=Previous sequenceNumber [343522] is no longer available for partition [0]. You can clear the previous sequenceNumber and start reading from a valid message by using the supervisor's reset API., noticeClass=RunNotice}
Two supervisor tasks can stop processing metrics: "apim_metrics_hour" and "apim_metrics".
Solution:
Start a shell in the portal_coordinator container:
docker exec -it $(docker ps --filter name=portal_coordinator -q) /bin/sh
Run the following command to list the supervisors that are currently running:
$ curl -X GET http://localhost:8081/druid/indexer/v1/supervisor
["apim_metrics_hour","apim_metrics"]
Run the following curl command to get the current status of the metrics indexer:
curl -X GET -H 'Content-Type:application/json' http://localhost:8081/druid/indexer/v1/supervisor/apim_metrics_hour/status
{"id":"apim_metrics_hour","generationTime":"2020-09-04T12:46:04.039Z","payload":{"dataSource":"apim_metrics_hour","stream":"apim_metrics","partitions":1,"replicas":1,"durationSeconds":3600,"activeTasks":[{"id":"index_kafka_apim_metrics_hour_f9c8b2384a92f14_ehgflkec","startingOffsets":{"0":672},"startTime":"2020-09-04T12:36:12.123Z","remainingSeconds":3008,"type":"ACTIVE","currentOffsets":{"0":672},"lag":{"0":0}}],"publishingTasks":[],"latestOffsets":{"0":672},"minimumLag":{"0":0},"aggregateLag":0,"offsetsLastUpdated":"2020-09-4T12:45:49.406Z","suspended":false,"healthy":true,"state":"RUNNING","detailedState":"RUNNING","c
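The status response is long, so it can help to pull out only the fields of interest. The following is a minimal sketch, assuming grep, head, cut and tr are available in the container shell; the helper name status_field is hypothetical, and it is a plain-text match on the key rather than a real JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first occurrence of a scalar
# field (for example "state" or "healthy") in the status JSON read from stdin.
# This is a text match on "key":value, not a full JSON parser.
status_field() {
    key="$1"
    grep -o "\"$key\":[^,}]*" | head -n 1 | cut -d: -f2- | tr -d '"'
}

# Usage, from inside the portal_coordinator container:
#   curl -s http://localhost:8081/druid/indexer/v1/supervisor/apim_metrics_hour/status | status_field state
#   curl -s http://localhost:8081/druid/indexer/v1/supervisor/apim_metrics_hour/status | status_field healthy
```

For the sample response above, the fields "state" and "healthy" carry the values RUNNING and true.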
Run the following command to reset the supervisor for "apim_metrics_hour":
curl -X POST http://localhost:8081/druid/indexer/v1/supervisor/apim_metrics_hour/reset
Run the following command to reset the supervisor for "apim_metrics":
curl -X POST http://localhost:8081/druid/indexer/v1/supervisor/apim_metrics/reset
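The two reset calls can also be wrapped in a small loop. This is a sketch only: the function name reset_supervisors and the DRUID_URL and CURL variables are hypothetical conveniences, and the endpoint is the same one used in the commands above.

```shell
#!/bin/sh
# Assumed defaults: the same local endpoint as the commands above.
# CURL can be overridden (for example CURL=echo) to preview the calls.
DRUID_URL="${DRUID_URL:-http://localhost:8081}"
CURL="${CURL:-curl}"

# Hypothetical helper: POST to the reset endpoint of each named supervisor,
# stopping at the first call that fails.
reset_supervisors() {
    for name in "$@"; do
        echo "Resetting supervisor: $name"
        $CURL -X POST "$DRUID_URL/druid/indexer/v1/supervisor/$name/reset" || return 1
        echo
    done
}

# Usage, from inside the portal_coordinator container:
#   reset_supervisors apim_metrics_hour apim_metrics
```

Setting CURL=echo before calling the function prints the requests instead of sending them, which is a quick way to check the URLs first.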
Close the shell
Check the portal_coordinator log and confirm that the above error no longer occurs.
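One possible way to run this check from the Docker host is to count recent occurrences of the error text. This is a sketch under assumptions: the helper name count_offset_errors is hypothetical, and it assumes the container log is reachable with docker logs on the host (if the logs are only in the journal, the same helper can be fed from there).

```shell
#!/bin/sh
# Hypothetical helper: count the log lines on stdin that contain the
# "no longer available" message from the error shown above.
count_offset_errors() {
    grep -c "is no longer available for partition"
}

# Usage, on the Docker host (the 10-minute window is an arbitrary choice):
#   docker logs --since 10m "$(docker ps --filter name=portal_coordinator -q)" 2>&1 | count_offset_errors
```

A count of 0 over a window that starts after the resets suggests the supervisors are no longer hitting the stale sequence number.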
Check the analytics dashboard; new API data should be available again.