/var/log/arkin/hadoop-yarn/containers/application_#############_0004/container_#############_0004_01_000002/taskmanager.log below errors and expectations are seenYYYY-MM-DDT14:14:13.700Z INFO kafka.clients.FetchSessionHandler Source Data Fetcher for Source: SDMProcessSRC -> GenSDM -> Filter -> MetStoreMap -> (Sink: RAW_METRIC_SINK, Sink: FlinkKafkaProducer, async wait operator -> Timestamps/Watermarks -> Flat Map, Filter -> Map) (5/8)_1 handleError:445 [Consumer clientId=vrniflink-4, groupId=vrniflink] Error sending fetch request (sessionId=1450162592, epoch=1) to node 0: {}.
org.apache.kafka.common.errors.DisconnectException: nullFull error stack message as below:
YYYY-MM-DDT14:19:58.401Z ERROR runtime.taskexecutor.TaskExecutor flink-akka.actor.default-dispatcher-17 onFatalError:2112 Fatal error occurred in TaskExecutor akka.tcp://flink@localhost:39797/user/rpc/taskmanager_0.
org.apache.flink.util.FlinkException: The TaskExecutor's registration at the ResourceManager akka.tcp://flink@localhost:34411/user/rpc/resourcemanager__ has been rejected: Rejected TaskExecutor registration at the ResourceManager because: The ResourceManager does not recognize this TaskExecutor.
at org.apache.flink.runtime.taskexecutor.TaskExecutor_ResourceManagerRegistrationListener.onRegistrationRejection(TaskExecutor.java:2293) _[flink-dist_2.12-1.14.6.jar:1.14.6]
at org.apache.flink.runtime.taskexecutor.TaskExecutor_ResourceManagerRegistrationListener.onRegistrationRejection(TaskExecutor.java:2248) _[flink-dist_2.12-1.14.6.jar:1.14.6]
at org.apache.flink.runtime.taskexecutor.TaskExecutorToResourceManagerConnection.onRegistrationRejection(TaskExecutorToResourceManagerConnection.java:109) _[flink-dist_2.12-1.14.6.jar:1.14.6]
at org.apache.flink.runtime.taskexecutor.TaskExecutorToResourceManagerConnection.onRegistrationRejection(TaskExecutorToResourceManagerConnection.java:40) _[flink-dist_2.12-1.14.6.jar:1.14.6]
at org.apache.flink.runtime.registration.RegisteredRpcConnection.lambda_createNewRegistration_0(RegisteredRpcConnection.java:269) _[flink-dist_2.12-1.14.6.jar:1.14.6]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) _[_:_]
at java.util.concurrent.CompletableFuture_UniWhenComplete.tryFire(CompletableFuture.java:841) _[_:_]
at java.util.concurrent.CompletableFuture_Completion.run(CompletableFuture.java:482) _[_:_]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda_handleRunAsync_4(AkkaRpcActor.java:455) _[flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) _[flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:455) _[flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:213) _[flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163) _[flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at scala.PartialFunction.applyOrElse_(PartialFunction.scala:122) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at scala.PartialFunction_OrElse.applyOrElse(PartialFunction.scala:171) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at scala.PartialFunction_OrElse.applyOrElse(PartialFunction.scala:172) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at scala.PartialFunction_OrElse.applyOrElse(PartialFunction.scala:172) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.actor.Actor.aroundReceive(Actor.scala:537) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.actor.Actor.aroundReceive_(Actor.scala:535) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.actor.ActorCell.invoke(ActorCell.scala:548) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.dispatch.Mailbox.run(Mailbox.scala:231) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at akka.dispatch.Mailbox.exec(Mailbox.scala:243) [flink-rpc-akka_2bd0469e-3e4b-45ed-8fb4-ab5b8ca56b8d.jar:1.14.6]
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373) [_:_]
at java.util.concurrent.ForkJoinPool_WorkQueue.topLevelExec(ForkJoinPool.java:1182) [_:_]
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655) [_:_]
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622) [_:_]
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165) [_:_]
YYYY-MM-DDT14:19:58.408Z ERROR runtime.taskexecutor.TaskManagerRunner flink-akka.actor.default-dispatcher-17 onFatalError:330 Fatal error occurred while executing the TaskManager. Shutting it down...
org.apache.flink.util.FlinkException: The TaskExecutor's registration at the ResourceManager akka.tcp://flink@localhost:34411/user/rpc/resourcemanager__ has been rejected: Rejected TaskExecutor registration at the ResourceManager because: The ResourceManager does not recognize this TaskExecutor.The case of this is unknown, however from above errors and expectations some issue with Kafka hence restart of service should fix this issue and lags are expected to settle down gradually.
To resolve this issue , perform below:
ub
sudo systemctl stop kafka.service
sudo systemctl stop zookeeper-server.service sudo systemctl start kafka.service
sudo systemctl start zookeeper-server.service
./check-service-health.sh -p -dNote: All services should be running and healthy when above command is executed.