Scaling Apache Kafka with Strimzi and KEDA
Apache Kafka has become the backbone of many modern distributed systems, providing a robust and scalable way to stream data. As organizations adopt Kubernetes for container orchestration, running Kafka well in this dynamic environment becomes increasingly important. Most organizations want scalable Kafka deployments, and Strimzi can help us accomplish that. Strimzi is an open-source project that provides Kubernetes operators for deploying and managing Kafka clusters. Besides the Kafka brokers themselves, Strimzi can deploy and manage several related components, including:
ZooKeeper
Kafka Connect
Kafka MirrorMaker
Kafka Exporter
This blog is not just about deployments, though; it’s about leveraging the capabilities of Strimzi to orchestrate Kafka clusters. While we walk through running Kafka with Strimzi on Kubernetes, we will also go over how to effectively autoscale the deployments with KEDA (Kubernetes Event-driven Autoscaling). KEDA will be an essential component in achieving scalability for our Kafka deployments.
Installing the Strimzi Kafka Operator on Kubernetes
To begin, let’s start our local Minikube cluster with the following command:
$ minikube start --memory=4096
Now that our Minikube cluster is up and running, let’s create a dedicated namespace, where we’ll orchestrate our Kafka deployment:
$ kubectl create namespace kafka
Step 1: Set Strimzi Version
$ export STRIMZI_VERSION=0.38.0
Step 2: Deploy the Strimzi Kafka Operator
$ curl -L https://github.com/strimzi/strimzi-kafka-operator/releases/download/${STRIMZI_VERSION}/strimzi-cluster-operator-${STRIMZI_VERSION}.yaml \
| sed 's/namespace: .*/namespace: kafka/' \
| kubectl apply -f - -n kafka
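The sed stage in the pipeline above rewrites every namespace: field in the downloaded manifest so that the operator's resources point at our kafka namespace instead of the default one baked into the release file. The sketch below shows the effect on a small fragment; the fragment is a hypothetical one written in the style of a RoleBinding subject, not copied from the release YAML, and it assumes the default namespace is myproject.

```shell
# Hypothetical manifest fragment, used only to show what the sed stage does.
sample='subjects:
  - kind: ServiceAccount
    name: strimzi-cluster-operator
    namespace: myproject'

# Same substitution as in Step 2: replace whatever follows "namespace: ".
rewritten=$(printf '%s\n' "$sample" | sed 's/namespace: .*/namespace: kafka/')

printf '%s\n' "$rewritten"
```

Because the pattern matches any line containing namespace: followed by a value, it rewrites every such field in the file, which is what we want when the operator and the Kafka cluster live in the same namespace.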
Once the Step 2 command has run successfully and its manifests are applied to our kafka namespace, we will be able to use Strimzi. Let’s briefly summarize what was created:
Strimzi Operator Setup:
RBAC resources (ClusterRoles and their bindings) that define what the operator is allowed to do in the cluster
A ConfigMap and a ServiceAccount for the strimzi-cluster-operator
Security:
Roles and permissions that govern what the operator and the Kafka brokers may access
Separate roles for Kafka client components, so their permissions are scoped independently of the brokers
Connectivity:
Custom resource definitions for the Kafka components that Strimzi manages, including the Kafka Bridge, which exposes Kafka to HTTP clients
Deployment:
The Strimzi Cluster Operator itself now runs as a Deployment. The manifest also sets up the operator’s interactions with Kubernetes, including client delegation, so that it can orchestrate Kafka clusters smoothly.
Now that we have the Strimzi Operator up and running in the kafka namespace, let’s check the state of its resources:
$ kubectl get all -n kafka
The output should resemble the following, showing that the Strimzi Operator deployment in the kafka namespace is ready:
NAME                                           READY   STATUS    RESTARTS   AGE
pod/strimzi-cluster-operator-95d88f6b5-ww5pv   1/1     Running   0          3m2s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/strimzi-cluster-operator   1/1     1            1           3m2s

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/strimzi-cluster-operator-95d88f6b5   1         1         1       3m2s
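If the pod is not yet Running when you check, it can be convenient to block until the operator has finished rolling out instead of polling by hand. A minimal sketch, assuming the Deployment is named strimzi-cluster-operator as in the output above; the helper name wait_for_operator and its default timeout are hypothetical choices for this post.

```shell
# Hypothetical helper: wait until the operator Deployment has rolled out.
#   $1: namespace (defaults to kafka)
#   $2: timeout   (defaults to 300s)
wait_for_operator() {
  kubectl rollout status deployment/strimzi-cluster-operator \
    -n "${1:-kafka}" --timeout="${2:-300s}"
}

# Usage (requires a running cluster and kubectl on the PATH):
#   wait_for_operator kafka 120s
```

The function returns a non-zero exit code if the rollout does not complete within the timeout, which makes it easy to use in a setup script.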
In the next step, we’re going to deploy a Strimzi Kafka cluster with a single broker and a single ZooKeeper node. The example we apply also configures persistent storage, so the Kafka data is durable across pod restarts.
$ kubectl apply -f https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/${STRIMZI_VERSION}/examples/kafka/kafka-persistent-single.yaml -n kafka
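For orientation, a single-broker Kafka resource with persistent storage might look roughly like the one written out below. This is an illustrative sketch of the Strimzi Kafka custom resource (kafka.strimzi.io/v1beta2), not a copy of the upstream example file: the storage sizes, the single plain listener, and the replication settings are assumptions chosen for a small local cluster.

```shell
# Write an illustrative single-node Kafka resource to a scratch file.
# Field values below are assumptions for a local test cluster.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 1                 # single broker
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    config:
      # With one broker, internal topics can only have one replica.
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
    storage:
      type: persistent-claim    # backed by a PersistentVolumeClaim
      size: 10Gi
      deleteClaim: false
  zookeeper:
    replicas: 1                 # single ZooKeeper node
    storage:
      type: persistent-claim
      size: 10Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
EOF

echo "Wrote $manifest"
# To try it in place of the upstream file:
#   kubectl apply -f "$manifest" -n kafka
```

Setting deleteClaim to false keeps the PersistentVolumeClaims around if the cluster is deleted, which is the safer default while experimenting.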
Let’s assess the status of our Kafka cluster:
$ kubectl get kafka -n kafka
The output should give us information about our cluster, such as the desired number of Kafka and ZooKeeper replicas and whether the cluster is currently in a ready state:
NAME DESIRED KAFKA REPLICAS DESIRED ZK REPLICAS READY WARNINGS
my-cluster 1