When the flow rate is too high, or the metadata within each flow is too large (for example, when there are too many groups), the total size of the flows may exceed the input size threshold used for compaction jobs. When that happens, the compaction jobs do not run.
Without compaction, flow storage usage grows rapidly. This may cause the daily or weekly reindexing tasks to fail, which in turn makes flow storage grow even faster.
Verification:
Log in to the SSP UI, navigate to System > Platform & Features > Metrics, and scroll to "Druid Task Failures". You may see failures in Flow Visualization - Index Parallel and Flow Recommendation - Index Parallel.
Environment:
vDefend SSP 5.0.0
Cause:
A high flow rate and large flow size cause the compaction and reindexing jobs to fail.
Increase Compaction Job Parameters:
Increase the compaction job's input threshold, max heap size, and concurrent task count for the correlated_flow_viz table.
# Log in to the druid-router pod
k -n nsxi-platform exec -it svc/druid-router -- bash
# Post the updated compaction spec
# Note that the value for inputSegmentSizeBytes has increased from 3000000000 to 4000000000 and maxNumConcurrentSubTasks has increased from 1 to 2
curl 'https://druid-router:8280/druid/coordinator/v1/config/compaction' \
-H 'Content-Type: application/json' \
--data-raw '{"dataSource":"correlated_flow_viz","taskPriority":25,"inputSegmentSizeBytes":4000000000,"maxRowsPerSegment":null,"skipOffsetFromLatest":"PT1H30M","tuningConfig":{"maxRowsInMemory":500000,"appendableIndexSpec":null,"maxBytesInMemory":100000000,"maxTotalRows":null,"splitHintSpec":{"type":"maxSize","maxSplitSize":4294967296,"maxNumFiles":1000},"partitionsSpec":{"type":"dynamic","maxRowsPerSegment":5000000,"maxTotalRows":10000000},"indexSpec":null,"indexSpecForIntermediatePersists":null,"maxPendingPersists":null,"pushTimeout":null,"segmentWriteOutMediumFactory":null,"maxNumConcurrentSubTasks":2,"maxRetry":null,"taskStatusCheckPeriodMs":null,"chatHandlerTimeout":null,"chatHandlerNumRetries":null,"maxNumSegmentsToMerge":null,"totalNumMergeTasks":null,"maxColumnsToMerge":5000,"type":"index_parallel","forceGuaranteedRollup":false},"granularitySpec":null,"dimensionsSpec":null,"metricsSpec":null,"transformSpec":null,"ioConfig":null,"engine":null,"taskContext":{"druid.indexer.fork.property.druid.processing.buffer.sizeBytes":"128000000","druid.indexer.runner.javaOpts":"-Xms128M -Xmx1024M -XX:MaxDirectMemorySize=1G"}}' \
--insecure
# Verify the spec is updated
curl 'https://druid-router:8280/druid/coordinator/v1/config/compaction/correlated_flow_viz' --insecure
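The verification response can also be checked mechanically instead of by eye. The following is a minimal sketch, assuming `grep` and `tr` are available in the shell where the response is read. `check_compaction_spec` is a hypothetical helper name, not part of Druid or SSP; it only looks for the two values changed in the spec above.

```shell
# Hypothetical helper (not part of Druid or SSP): reads a compaction spec as
# JSON on stdin and checks the two values that the updated spec changes.
check_compaction_spec() {
  # Strip spaces and newlines so the match does not depend on JSON formatting.
  spec=$(tr -d ' \n')
  status=0
  for pair in inputSegmentSizeBytes:4000000000 maxNumConcurrentSubTasks:2; do
    key=${pair%%:*}
    want=${pair#*:}
    # Require a trailing "," or "}" so that 2 does not match 20, and so on.
    if printf '%s' "$spec" | grep -qE "\"$key\":$want[,}]"; then
      echo "OK $key=$want"
    else
      echo "MISMATCH $key (expected $want)"
      status=1
    fi
  done
  return $status
}
```

To use it, pipe the output of the verification curl command (with `-s` added to suppress the progress meter) into `check_compaction_spec`. A non-zero exit status means at least one of the two values was not found as expected.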
Increase Storage Size for Druid Middle Manager:
Ensure the backend storage has enough resources.
Increase the PVC size for each druid-middle-manager pod from 16Gi to 32Gi.
# List all Druid middle manager pods and all Druid middle manager PVCs
k -n nsxi-platform get pods -l app.kubernetes.io/component=druid.middleManager
k -n nsxi-platform get pvc -l component=middle-manager
# For each PVC, increase the size from 16Gi to 32Gi
k -n nsxi-platform patch pvc <data-druid-middle-manager-X> -p '{"spec": {"resources": {"requests": {"storage": "32Gi"}}}}'
# Restart all druid middle manager pods
k -n nsxi-platform delete pod <druid-middle-manager-0> <druid-middle-manager-1> ..
# Check if all pvcs are updated
k -n nsxi-platform get pvc -l component=middle-manager
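When there are several middle manager PVCs, the per-PVC patch commands can be generated rather than typed one by one. The following is a minimal sketch, assuming `k` in the commands above is an alias for `kubectl` (aliases are not expanded in non-interactive shells, so the sketch spells out `kubectl`). `print_pvc_patch_commands` is a hypothetical helper name. It only prints the commands and executes nothing.

```shell
# Hypothetical helper: reads PVC names on stdin (one per line) and prints one
# patch command per PVC. Nothing is executed; review the output before use.
print_pvc_patch_commands() {
  size=${1:-32Gi}   # target size, defaults to 32Gi as in the steps above
  while IFS= read -r pvc; do
    [ -n "$pvc" ] || continue   # skip blank lines
    printf "kubectl -n nsxi-platform patch %s -p '{\"spec\": {\"resources\": {\"requests\": {\"storage\": \"%s\"}}}}'\n" "$pvc" "$size"
  done
}
```

For example, `kubectl -n nsxi-platform get pvc -l component=middle-manager -o name | print_pvc_patch_commands 32Gi` prints one patch command per PVC in `persistentvolumeclaim/<name>` form. After checking the printed commands, they can be run manually or piped to `sh`.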