ADE not reading from all Kafka partitions on Kafka PVC recreation will have all of the following symptoms:
WatchTower 1.2
If the Kafka PVC is deleted and recreated while all WT services are up and running (that is, without first bringing the WT services down), the ADE service does not automatically consume from all of the partitions on its input topic. The Kafka PVC is the source of message offsets, topic partitions, and related metadata for the consumers of Kafka topics. ADE is one of these consumers and needs this information to function as expected. Deleting the PVC makes the consumers lose this information. This failure mode can occur only if someone deletes the Kafka PVC.
Kafka PVC Deletion
Deleting the Kafka PVC is a non-standard operation that the customer should never perform. Deleting it results in data loss.
Determination:
This failure mode can be identified by comparing the creation timestamp of the Kafka PVC with the timestamps of the Ingestor and ADE services. If the Kafka PVC creation time is newer than the time at which the services came up, the system is in this failure mode.
Kafka PVC Timestamp
kubectl -n "${NAMESPACE}" get pvc common-service-kafka-pvc-kafka-0 -o jsonpath="{.metadata.creationTimestamp}"
Pod Ready Timestamp
kubectl -n "${NAMESPACE}" get pod data-insights-ingestor-... -o jsonpath="{range .status.conditions[*]}{.type}{','}{.lastTransitionTime}{'\n'}{end}"
kubectl -n "${NAMESPACE}" get pod ml-insights-profiler-ade-0 -o jsonpath="{range .status.conditions[*]}{.type}{','}{.lastTransitionTime}{'\n'}{end}"
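The comparison described above can be scripted. The following is a minimal sketch, not a supported tool. The function names `is_newer`, `pod_ready_time`, and `check_pod` are hypothetical. It assumes that Kubernetes reports both timestamps in the UTC form `YYYY-MM-DDTHH:MM:SSZ`, and it uses the `lastTransitionTime` of the pod's Ready condition as the pod-side timestamp, which is an assumption about what "came up" should mean here.

```shell
# Succeeds (exit 0) when the first timestamp is later than the second.
# Assumption: both are UTC timestamps such as 2024-05-01T10:15:30Z, so
# removing the non-digit characters leaves numbers that order like the times.
is_newer() {
  a=$(printf '%s' "$1" | tr -dc '0-9')
  b=$(printf '%s' "$2" | tr -dc '0-9')
  # Refuse to compare if either timestamp is missing.
  [ -n "$a" ] && [ -n "$b" ] || return 2
  [ "$a" -gt "$b" ]
}

# Hypothetical helper: prints the time at which the pod's Ready condition
# last changed.
pod_ready_time() {
  kubectl -n "${NAMESPACE}" get pod "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].lastTransitionTime}'
}

# Hypothetical helper: compares the Kafka PVC creation time with one pod.
check_pod() {
  pvc_time=$(kubectl -n "${NAMESPACE}" get pvc common-service-kafka-pvc-kafka-0 \
    -o jsonpath='{.metadata.creationTimestamp}')
  ready_time=$(pod_ready_time "$1")
  if is_newer "$pvc_time" "$ready_time"; then
    echo "$1: Kafka PVC ($pvc_time) is newer than pod Ready time ($ready_time)"
  else
    echo "$1: Kafka PVC is not newer than pod Ready time, or a timestamp is missing"
  fi
}
```

With `NAMESPACE` set, `check_pod ml-insights-profiler-ade-0` runs the check for the ADE pod. The Ingestor pod name has a generated suffix, so pass its full name as the argument.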
Resolution:
If this failure mode is encountered, perform the following steps to fix the issue:
1. Scale down both the Ingestor and ADE services
Scale down services
kubectl -n "${NAMESPACE}" scale deployment data-insights-ingestor --replicas=0
kubectl -n "${NAMESPACE}" scale statefulset ml-insights-profiler-ade --replicas=0
2. Scale up Ingestor service and wait for it to be in the Ready state
Scale up Ingestor service
kubectl -n "${NAMESPACE}" scale deployment data-insights-ingestor --replicas=1
# Wait until it is READY
kubectl -n "${NAMESPACE}" get deployment data-insights-ingestor
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
data-insights-ingestor   1/1     1            1           2m
3. Scale up the ADE service
Scale up ADE service
kubectl -n "${NAMESPACE}" scale statefulset ml-insights-profiler-ade --replicas=1
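The three steps above can be combined into one script. This is a sketch under stated assumptions rather than a supported procedure. The function name `recover_ade` and the variables `KUBECTL` and `TIMEOUT` are hypothetical. It also replaces the manual READY check with `kubectl rollout status`, which blocks until the Ingestor deployment has finished rolling out or the timeout expires.

```shell
# Illustrative variables, not part of the product.
# Set KUBECTL="echo kubectl" for a dry run that only prints the commands.
KUBECTL="${KUBECTL:-kubectl}"
TIMEOUT="${TIMEOUT:-300s}"

recover_ade() {
  # 1. Scale down both the Ingestor and ADE services.
  $KUBECTL -n "${NAMESPACE}" scale deployment data-insights-ingestor --replicas=0 || return 1
  $KUBECTL -n "${NAMESPACE}" scale statefulset ml-insights-profiler-ade --replicas=0 || return 1

  # 2. Scale up the Ingestor and wait for the deployment to become ready.
  $KUBECTL -n "${NAMESPACE}" scale deployment data-insights-ingestor --replicas=1 || return 1
  $KUBECTL -n "${NAMESPACE}" rollout status deployment/data-insights-ingestor \
    --timeout="${TIMEOUT}" || return 1

  # 3. Scale up the ADE service only after the Ingestor is ready.
  $KUBECTL -n "${NAMESPACE}" scale statefulset ml-insights-profiler-ade --replicas=1
}
```

Run `recover_ade` with `NAMESPACE` set. The function stops at the first failing command, so ADE is not scaled up if the Ingestor does not become ready within the timeout.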