We recently upgraded our old 21.6 Kafka Agent to the latest version, 24.10.

Article ID: 386543

Products

DX SaaS

Issue/Introduction

We have observed numerous issues and instability since the upgrade to 24.10.

1) In the 21.6 version of the Agent, we used the following setting to capture JMX metrics from our various Kafka instances:

introscope.agent.kafka.jmx.include.filter=confluent-authorizer-metrics:*;kafka.admin.client:*;kafka.cluster:*;kafka.consumer:*;kafka.controller:*;kafka.coordinator.group:*;kafka.coordinator.transaction:*;kafka.databalancer:*;kafka.log:*;kafka.network:*;kafka.producer:*;kafka.rest:*;kafka.security:*;kafka.server:*;kafka.tier:*;kafka.tier.tasks:*;kafka.tier.tasks.archive:*;kafka.tier.tasks.delete:*;kafka.utils:*;java.lang:type=GarbageCollector,name=G1 Young Generation;java.lang:type=GarbageCollector,name=G1 Old Generation;java.lang:type=MemoryPool,name=G1 Old Gen;java.lang:type=MemoryPool,name=G1 Eden Space;java.lang:type=OperatingSystem;java.lang:type=Threading;org.apache.ZooKeeperService:*

When we replicate the same JMX filter under version 24.10, the Agent JVM quickly runs out of heap, even after the heap size is increased to 2.5 GB.

2) To regain some stability after the issue described in point 1, we reduced the include filter for one broker to the following:

introscope.agent.kafka.broker.broker1.jmx.include.filter=kafka.server:*;

together with the following exclude filter:

introscope.agent.kafka.broker.broker1.jmx.exclude.filter=kafka.server:type=Request,client-id=*;kafka.cluster:type=Partition,name=CaughtUpReplicasCount,*;kafka.cluster:type=Partition,name=DeferredUnderMinIsr,*;kafka.cluster:type=Partition,name=BlockedOnMirrorSource,*;kafka.cluster:type=Partition,name=LastStableOffsetLag,*;kafka.cluster:type=Partition,name=MirrorReplicasCount,*;kafka.cluster:type=Partition,name=UnderMinIsrMirror,*;kafka.cluster:type=Partition,name=UnderReplicatedMirror,*;

This reduced heap memory consumption to a reasonable level; however, metrics are now reported only sporadically.
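When trimming filters like the ones in point 2, it can help to check which exclude patterns can take effect at all. The following is a minimal Python sketch, not part of the Agent, and it rests on an assumption the article does not confirm: that the exclude filter is applied only to MBeans already selected by the include filter. Under that assumption, an exclude pattern whose JMX domain is not covered by any include pattern has nothing to act on. The helper names (split_filter, domain_of, unreachable_excludes) are hypothetical, and the sample values are shortened from the filters above.

```python
from fnmatch import fnmatchcase


def split_filter(value):
    """Split a semicolon-separated filter value into non-empty patterns."""
    return [p.strip() for p in value.split(";") if p.strip()]


def domain_of(pattern):
    """Return the JMX domain, i.e. the text before the first colon."""
    return pattern.split(":", 1)[0]


def unreachable_excludes(include_value, exclude_value):
    """List exclude patterns whose domain no include pattern covers.

    Assumption: excludes only apply to MBeans that the include filter
    has already selected. The Agent's real matching rules may differ.
    """
    include_domains = {domain_of(p) for p in split_filter(include_value)}
    return [
        p
        for p in split_filter(exclude_value)
        if not any(fnmatchcase(domain_of(p), d) for d in include_domains)
    ]


# Shortened sample values taken from the filters in point 2.
include = "kafka.server:*;"
exclude = (
    "kafka.server:type=Request,client-id=*;"
    "kafka.cluster:type=Partition,name=CaughtUpReplicasCount,*;"
    "kafka.cluster:type=Partition,name=DeferredUnderMinIsr,*;"
)

for pattern in unreachable_excludes(include, exclude):
    print(pattern)
```

With these sample values the sketch prints the two kafka.cluster patterns, because the include filter selects only the kafka.server domain. If the assumption holds for the Agent, those entries could be dropped from the exclude filter without changing which metrics are collected.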

Resolution

Set introscope.agent.remotejmx.softConfigSync.interval.seconds=0 and turn off debug logging in the Agent.