Symptoms:
- Data from applications is visible in AXA, but only after a delay of several hours
- "Cluster management" reports "AXA Transformer" CPU at 400%
Cause:
There is not enough capacity to process all incoming AXA data.
How to verify this condition?
1) Connect to a Kafka pod:
kubectl exec -ti <jarvis-kafka-pod> -n <namespace> -- sh
2) Check whether there is a lag in processing the incoming AXA data by running the command below:
/opt/ca/kafka/bin/kafka-consumer-groups.sh --bootstrap-server jarvis-kafka:9092,jarvis-kafka-2:9092,jarvis-kafka-3:9092 --describe --group axa.transformer
The numbers in the LAG column of the output indicate how far data processing is behind. Run the above command at several different times; if the lag persists, that confirms the capacity issue.
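To compare readings taken at different times, the per-partition lag can be summed into a single number. The following is a minimal sketch, assuming the describe output contains a header row with a LAG column and that awk is available where the command runs; total_lag is a hypothetical helper name, not part of Kafka or AXA.

```shell
# Sum the LAG column of `kafka-consumer-groups.sh --describe` output read from stdin.
# The column is located by its header name because its position differs between Kafka versions.
total_lag() {
  awk '
    # Lines up to and including the header row: look for the field named LAG.
    lag_col == 0 {
      for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i
      next
    }
    # Data rows: add numeric lag values; non-numeric entries such as "-" are skipped.
    $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
    END { print sum + 0 }
  '
}

# Example (run inside the Kafka pod, using the command from step 2):
#   /opt/ca/kafka/bin/kafka-consumer-groups.sh \
#     --bootstrap-server jarvis-kafka:9092,jarvis-kafka-2:9092,jarvis-kafka-3:9092 \
#     --describe --group axa.transformer | total_lag
```

If the total stays high or keeps growing across several runs, the consumers are not keeping up with the incoming data.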
Solution:
1) Increase the number of replicas for the AXA services as shown below:
kubectl scale --replicas=4 deployment axaservices-transformer -n<namespace>
kubectl scale --replicas=4 deployment axaservices-indexer -n<namespace>
kubectl scale --replicas=4 deployment axaservices-kibana-indexer -n<namespace>
kubectl scale --replicas=4 deployment axaservices-axa-user-processor -n<namespace>
kubectl scale --replicas=2 deployment axaservices-axa-ng-aggregator -n<namespace>
2) Connect to a Kafka pod and run the lag check again to verify that the lag issue is resolved.
NOTE: You can increase the number of replicas of the above deployments up to a maximum of 5. If the problem persists, contact Broadcom Support.
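The scaling commands can also be generated from one place so that the replica limit is checked before anything is applied. This is a rough sketch only: scale_cmd and scale_plan are hypothetical helper names, the functions only print the kubectl commands rather than running them, and the separate replica count for the aggregator mirrors the lower value used for it in step 1.

```shell
MAX_REPLICAS=5  # upper limit stated in the NOTE

# Print the kubectl command that scales one deployment; refuse counts above the limit.
# Usage: scale_cmd <namespace> <deployment> <replicas>
scale_cmd() {
  if [ "$3" -gt "$MAX_REPLICAS" ]; then
    echo "refusing to scale $2 above $MAX_REPLICAS replicas" >&2
    return 1
  fi
  echo "kubectl scale --replicas=$3 deployment $2 -n $1"
}

# Print the commands for all AXA deployments listed in step 1.
# Usage: scale_plan <namespace> <replicas> <aggregator-replicas>
scale_plan() {
  for d in axaservices-transformer axaservices-indexer \
           axaservices-kibana-indexer axaservices-axa-user-processor; do
    scale_cmd "$1" "$d" "$2" || return 1
  done
  scale_cmd "$1" axaservices-axa-ng-aggregator "$3"
}

# Example: review the output first, then pipe it to sh to apply it.
#   scale_plan <namespace> 4 2
#   scale_plan <namespace> 4 2 | sh
```

Printing the commands instead of executing them keeps the sketch safe to try, and the review step makes it clear exactly which deployments will change.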