In this article we will create an Apache Kafka multi-broker cluster on three machines. We have three Ubuntu machines with 4 GB of RAM each:

X.42.153.187
X.26.29.154
X.40.14.204


Before starting, make sure Java 7 or later is installed on each instance. Check the Java version with the following command:

java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)


If Java is not already installed, install it by following these instructions here.
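To run the same check on each machine without reading the output by eye, a small helper can parse the version string. This is only a sketch: the function name java_version_ok is made up for this article, and it assumes the version appears in double quotes as in the output above.

```shell
# java_version_ok (hypothetical helper): reads `java -version` output on stdin
# and succeeds only if the reported major version is 7 or later.
java_version_ok() {
    awk -F'"' '
    /version/ && !seen {
        seen = 1
        split($2, v, ".")
        # Old scheme "1.8.0_101": major is the 2nd component.
        # Newer scheme "11.0.2": major is the 1st component.
        major = (v[1] == 1) ? v[2] : v[1]
        if (major + 0 >= 7) ok = 1
    }
    END { exit (ok ? 0 : 1) }'
}
```

Since java -version prints to stderr, it would be used as: java -version 2>&1 | java_version_ok && echo "Java OK"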

Apache Kafka needs a running ZooKeeper cluster. We already created a ZooKeeper cluster in the previous article and will use the same one here. The ZooKeeper cluster addresses are:

X.160.225.183:2181,X.161.45.167:2181,X.161.96.241:2181

To set up an Apache Kafka multi-broker cluster, follow these steps:
Step 1) Download and unpack the Kafka package on all three machines as shown below:

cd /opt/
wget http://mirror.fibergrid.in/apache/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz
tar -zxvf kafka_2.11-0.10.0.1.tgz


This will create an unpacked Kafka folder named kafka_2.11-0.10.0.1. Change into it:

cd kafka_2.11-0.10.0.1


Step 2) Configure the server.properties file on all three machines as shown below:

cd /opt/kafka_2.11-0.10.0.1/config/
vim server.properties


Change the following default properties:

broker.id=0
zookeeper.connect=localhost:2181

to:

# use broker.id=2 and broker.id=3 on the other two machines
broker.id=1
zookeeper.connect=X.160.225.183:2181,X.161.45.167:2181,X.161.96.241:2181
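If you would rather not edit the file by hand on every machine, the same two changes can be scripted. The following is a minimal sketch, assuming the file still contains the default broker.id= and zookeeper.connect= lines; the function name set_broker_config is hypothetical, not part of Kafka.

```shell
# set_broker_config FILE BROKER_ID ZK_CONNECT (hypothetical helper)
# Rewrites the broker.id and zookeeper.connect lines of a Kafka
# server.properties file and keeps a backup copy as FILE.bak.
set_broker_config() {
    file=$1
    broker_id=$2
    zk_connect=$3
    # Each expression replaces one whole "key=value" line; all other
    # settings (including zookeeper.connect.timeout.ms) are left untouched.
    sed -i.bak \
        -e "s|^broker\.id=.*|broker.id=${broker_id}|" \
        -e "s|^zookeeper\.connect=.*|zookeeper.connect=${zk_connect}|" \
        "$file"
}
```

On the second machine, for example, this would be called as: set_broker_config /opt/kafka_2.11-0.10.0.1/config/server.properties 2 "X.160.225.183:2181,X.161.45.167:2181,X.161.96.241:2181"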


Step 3) Start Kafka on each machine, one at a time, from the Kafka installation directory:

cd /opt/kafka_2.11-0.10.0.1
bin/kafka-server-start.sh config/server.properties


The Kafka cluster is now set up. Let's test it by creating a topic with replication factor 3 and 1 partition (the commands below connect to one of the ZooKeeper nodes):

bin/kafka-topics.sh --create --zookeeper 35.160.225.183:2181 --replication-factor 3 --partitions 1 --topic test-topic


Now let's check which broker is doing what in the cluster. Check the status of the newly created topic "test-topic" with the following command:

bin/kafka-topics.sh --describe --zookeeper 35.160.225.183:2181 --topic test-topic


You will see the following output:

Topic:test-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: test-topic Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2


Since this topic has only one partition, there is only one partition line. Partition: 0 is the partition number. The Leader of this partition is currently broker 3, and the Replicas list shows that copies are kept on brokers 3, 1 and 2. Isr (in-sync replicas) lists the replicas that are currently caught up with the leader; here all three brokers are in sync for the data in topic "test-topic".
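The relationship between Replicas and Isr can also be checked automatically: a partition is under-replicated when its Isr list is shorter than its Replicas list. Below is a rough sketch that reads the describe output and prints only such partitions. The function name under_replicated is made up for this article, and the parsing assumes the output layout shown above.

```shell
# under_replicated (hypothetical helper): reads `kafka-topics.sh --describe`
# output on stdin and prints every partition whose Isr list has fewer
# brokers than its Replicas list.
under_replicated() {
    awk '
    /Partition:/ && /Replicas:/ && /Isr:/ {
        topic = part = replicas = isr = ""
        # Each label ("Leader:", "Isr:", ...) is followed by its value.
        for (i = 1; i < NF; i++) {
            if ($i == "Topic:")     topic = $(i + 1)
            if ($i == "Partition:") part = $(i + 1)
            if ($i == "Replicas:")  replicas = $(i + 1)
            if ($i == "Isr:")       isr = $(i + 1)
        }
        # split() returns the number of comma-separated broker ids.
        if (split(isr, a, ",") < split(replicas, b, ","))
            print topic, part, "replicas=" replicas, "isr=" isr
    }'
}
```

Piping the describe command into under_replicated prints nothing for the output above, because all three replicas are in sync. If broker 2 were down and the Isr column read 3,1, it would print: test-topic 0 replicas=3,1,2 isr=3,1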

Now let's create another topic with replication factor 3 and 3 partitions and see how it is distributed across the brokers.

bin/kafka-topics.sh --create --zookeeper 35.160.225.183:2181 --replication-factor 3 --partitions 3 --topic test-topic-2


Now let's check the status of this topic. The command and its output look like this:

bin/kafka-topics.sh --describe --zookeeper 35.160.225.183:2181 --topic test-topic-2
Topic:test-topic-2 PartitionCount:3 ReplicationFactor:3 Configs:
Topic: test-topic-2 Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: test-topic-2 Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: test-topic-2 Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1


Here the three partition lines represent the three partitions of the topic. Each partition has a different leader, so leadership is spread across all three brokers.
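For topics with many partitions it is easier to count leaders per broker than to read every line. Here is a small sketch that does this from the describe output; the function name leaders_per_broker is hypothetical, and it assumes each partition line contains a "Leader:" label followed by the broker id.

```shell
# leaders_per_broker (hypothetical helper): reads `kafka-topics.sh --describe`
# output on stdin and prints "<broker id> <number of partitions it leads>",
# one broker per line, sorted by broker id.
leaders_per_broker() {
    awk '
    /Leader:/ {
        for (i = 1; i < NF; i++)
            if ($i == "Leader:") count[$(i + 1)]++
    }
    END {
        for (b in count) print b, count[b]
    }' | sort -n
}
```

For the test-topic-2 output above this prints one line per broker (1 1, 2 1 and 3 1), showing that each broker leads exactly one partition.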

That's it: we have created a multi-broker Kafka cluster on three Ubuntu machines. In upcoming articles we will look at configuration settings and at writing publishers and consumers for Kafka.
  • By Techburps.com
  • Oct 21, 2016
  • Big Data