How to build / deploy the kafka-connect-geode connector with a Kafka cluster

Article ID: 294013


Products

VMware Tanzu GemFire

Issue/Introduction

This KB is a quick-start guide that explains how to build and deploy the kafka-connect-geode connector with an Apache Geode / GemFire cluster and a Kafka cluster, covering both the case where the Apache Geode / GemFire cluster is used as a source and the case where it is used as a sink.

Resolution

How to build the kafka-connect-geode connector


Step 1:

Check out the git repository of the geode-kafka-connector project.

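For example, assuming the connector source is cloned from the Apache GitHub repository (the URL below is an assumption; use the repository location referenced by your environment if it differs):

```shell
# Clone the connector source and enter the project folder
git clone https://github.com/apache/geode-kafka-connector.git
cd geode-kafka-connector
```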
Step 2:

Change to the geode-kafka-connector-master local git repository folder and run "mvn package" to build the kafka-connect-geode connector package. After the build completes successfully, you can find kafka-connect-geode-XX-SNAPSHOT.jar or the kafka-connect-geode-XX-SNAPSHOT-package folder in the repository's target folder.

Note: You can also change the versions of any Geode / Kafka dependencies as needed in this project's pom.xml. After modifying any code, you may need to rebuild the package using the command "mvn spotless:apply package -DskipTests".

For example:
    <properties>
        <geode.core.version>1.12.1</geode.core.version>
        <geode.cq.version>1.12.1</geode.cq.version>
        <geode.dunit.version>1.12.1</geode.dunit.version>
        <kafka.connect-api.version>2.7.0</kafka.connect-api.version>
        <log4j.version>2.13.1</log4j.version>
        <kafka_2.13.version>2.7.0</kafka_2.13.version>
        <curator-framework.version>4.2.0</curator-framework.version>
        <kafka-streams-test-utils.version>2.7.0</kafka-streams-test-utils.version>
        <kafka.connect-runtime.version>2.7.0</kafka.connect-runtime.version>
        <junit.version>4.12</junit.version>
        <mockito.version>3.2.4</mockito.version>
        <JUnitParams.version>1.1.1</JUnitParams.version>
        <awaitility.version>3.1.6</awaitility.version>
        <maven-plugin.version>3.8.1</maven-plugin.version>
        <zookeeper.version>3.5.7</zookeeper.version>
        <spotless.version>1.27.0</spotless.version>
        <rat.version>0.13</rat.version>
        <confluent.maven.repo>http://packages.confluent.io/maven/</confluent.maven.repo>
    </properties>
 

How to deploy the kafka-connect-geode connector

Step 1:

Following Kafka's official quick-start guide, set up a Kafka cluster and make sure it is up and running. Then create a Kafka topic for the kafka-connect-geode connector.

For example:
# Start the ZooKeeper service
$ bin/zookeeper-server-start.sh config/zookeeper.properties

# Start the Kafka broker service
$ bin/kafka-server-start.sh config/server.properties

# Create a topic
$ bin/kafka-topics.sh --create --topic geode --bootstrap-server localhost:9092

Step 2:

Follow the Apache Geode / GemFire documentation to create an Apache Geode / GemFire cluster. Then create a region to be used when Apache Geode / GemFire is the source (the connector's default region name is gkcRegion) and a region to be used when Apache Geode / GemFire is the sink (the default name is gkcSinkRegion). The examples below use the custom names sourceRegion and sinkRegion, which are configured in the connector properties files in later steps.

For example:
gfsh> create region --name=sourceRegion --type=PARTITION
gfsh> create region --name=sinkRegion --type=PARTITION

Step 3:

Modify [KAFKA_INSTALLATION_HOME]/config/connect-standalone.properties (standalone mode) so that the plugin.path property points to the location of the kafka-connect-geode connector jar.

For example:
plugin.path=/Users/user1/Downloads/geode-kafka-connector-master/target/kafka-connect-geode-1.0-SNAPSHOT-package/share/java/kafka-connect-geode

Step 4:

Create a [KAFKA_INSTALLATION_HOME]/config/connect-geode-sink.properties file when using the Apache Geode / GemFire cluster as a sink, i.e. to pull data from the Kafka cluster into the Apache Geode / GemFire cluster.

For example:
name=geode-sink
connector.class=org.apache.geode.kafka.sink.GeodeKafkaSink
tasks.max=1
topic-to-regions=[geode:sinkRegion]
topics=geode
locators=192.168.0.10[10334]

Step 5:

Create a [KAFKA_INSTALLATION_HOME]/config/connect-geode-source.properties file when using the Apache Geode / GemFire cluster as a source, i.e. to push data from the Apache Geode / GemFire cluster to the Kafka cluster.

For example:
name=geode-source
connector.class=org.apache.geode.kafka.source.GeodeKafkaSource
tasks.max=1
region-to-topics=[sourceRegion:geode]
locators=192.168.0.10[10334]
cqsToRegister="select * from /sourceRegion limit 100"
load-entire-region=true
cq-prefix=cqForGeodeKafka

Step 6:

Run the kafka-connect-geode connector (standalone mode) by passing connect-standalone.properties together with connect-geode-source.properties and/or connect-geode-sink.properties, depending on which connectors you want to run.

For example:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-geode-source.properties config/connect-geode-sink.properties
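Once the standalone worker is running, you can optionally confirm that the connectors were loaded through the Kafka Connect REST interface. The sketch below assumes the worker listens on the default port 8083 from connect-standalone.properties, and uses the connector names geode-source and geode-sink defined in the properties files above:

```shell
# List the connectors registered with the standalone worker
curl http://localhost:8083/connectors

# Inspect the status of each connector and its tasks
curl http://localhost:8083/connectors/geode-source/status
curl http://localhost:8083/connectors/geode-sink/status
```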

Step 7:

Verify that the kafka-connect-geode connector works:

1. Put data into the source region when using the Apache Geode / GemFire cluster as a source.

For example:
gfsh> put --region=/sourceRegion --key="key1" --value="value1"
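To also confirm that the source connector is publishing the entry to the Kafka topic, you can consume the topic directly, using the topic name and broker address from the earlier steps:

```shell
# Read records from the geode topic from the beginning
$ bin/kafka-console-consumer.sh --topic geode --from-beginning --bootstrap-server localhost:9092
```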

2. Confirm the data arrived in the sink region when using the Apache Geode / GemFire cluster as a sink.

For example:
gfsh> query --query="select * from /sinkRegion"