#### This documentation was written by Hassan Hashmy
Multi-Node Cluster
Scaling a container with Docker Compose is as simple as using the scale command:
docker-compose scale kafka=3
This will create 2 more containers:
$ docker-compose scale kafka=3
Creating and starting kafkacluster_kafka_2 ... done
Creating and starting kafkacluster_kafka_3 ... done
You, as an application developer, only need to know one of the broker IPs, or use the service name to connect to the cluster.
As the documentation specifies, the client (e.g. a producer or consumer) uses it only once, to discover the IPs of all brokers in the cluster.
This means that Kafka scaling will be transparent to your application.
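For example, a console producer can reach the cluster through the Compose service name. A minimal sketch, assuming the service is called kafka and the brokers listen on port 9092; the topic name test is also just an example (newer Kafka versions use --bootstrap-server instead of --broker-list):
$ docker-compose exec kafka bash
# bin/kafka-console-producer.sh --broker-list kafka:9092 --topic test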
To validate that all brokers are part of the cluster, let’s use the Zookeeper client to check. From the client container:
$ docker-compose exec kafka bash
# bin/zookeeper-shell.sh zookeeper:2181
ls /brokers/ids
[1003, 1002, 1001]
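If you want more detail about a specific broker, the same shell can read its registration data (a sketch; 1001 is just one of the ids listed above):
get /brokers/ids/1001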
Scaling Topics
In Kafka, Topics are distributed in Partitions. Partitions allow scalability, enabling Topics to span several nodes, and parallelism,
allowing different instances of the same Consumer Group to consume messages in parallel.
Apart from this, Kafka manages how these Partitions are replicated, to achieve high availability.
If you have several replicas of one partition, one will be the leader and there will be zero or more followers spread across different nodes.
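For instance, you can create a topic with several partitions and replicas and then check where the leaders and followers ended up. A minimal sketch, run from inside a broker container; the topic name topic1, the partition count and the replication factor are only examples (newer Kafka versions use --bootstrap-server instead of --zookeeper):
$ docker-compose exec kafka bash
# bin/kafka-topics.sh --create --zookeeper zookeeper:2181 --topic topic1 --partitions 3 --replication-factor 2
# bin/kafka-topics.sh --describe --zookeeper zookeeper:2181 --topic topic1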
Expanding topics in your cluster
Expanding topics in your cluster means moving topics and partitions once you have more brokers in your cluster,
because, as shown before, your new brokers won’t store any data after they are created unless you create new topics.
You can do this in 3 steps:
Identify which topics you want to move.
Generate a candidate reassignment. This could be done automatically, or you can decide how to redistribute your topics.
Execute your reassignment plan.
You can do this by following the documentation here: http://kafka.apache.org/documentation/#basic_ops_cluster_expansion (the commands involved are sketched below).
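Roughly, the manual flow from that documentation looks like this. A minimal sketch, assuming a topics-to-move.json file listing the topics from step 1 and broker id 1003 as the destination; the file names and broker id are only examples:
$ docker-compose exec kafka bash
# bin/kafka-reassign-partitions.sh --zookeeper zookeeper:2181 --topics-to-move-json-file topics-to-move.json --broker-list "1003" --generate
# bin/kafka-reassign-partitions.sh --zookeeper zookeeper:2181 --reassignment-json-file reassignment.json --execute
# bin/kafka-reassign-partitions.sh --zookeeper zookeeper:2181 --reassignment-json-file reassignment.json --verify
where topics-to-move.json contains something like:
{
  "version": 1,
  "topics": [{"topic": "topic1"}]
}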
The steps described in the documentation are partly automated with Ansible:
Inside the playbooks/prepare-reassignment.yml file you have 2 variables:
vars:
  topics:
    - topic1
  broker_list: 1003
This will prepare a recipe to move your topic topic1 to the broker with id 1003.
You can paste the generated JSON into playbooks/reassign-topic-plan.json:
{
  "version": 1,
  "partitions": [{"topic": "topic1", "partition": 0, "replicas": [1003]}]
}
Then run this plan with another playbook: playbooks/execute-reassignment.yml
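A minimal sketch of how the two playbooks might be invoked, assuming an inventory file called inventory that targets your hosts (the inventory name is an assumption; use whatever your setup defines):
$ ansible-playbook -i inventory playbooks/prepare-reassignment.yml
$ ansible-playbook -i inventory playbooks/execute-reassignment.yml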
Confluent Platform images
All of this can be done in the same way with the Confluent Platform.
There are a couple of directories, confluent-cluster and confluent-client, to test this out: