Installing Kafka with Strimzi
The setup provided here is meant only for development purposes. Using a managed Kafka service is a sensible choice if you are not already running Kafka in production.
This guide shows how to install Strimzi and create a Kafka cluster suitable for use with Cloudflow.
Strimzi is a Kubernetes operator that simplifies the process of running Apache Kafka.
Installing Strimzi
In this guide, we will use Helm to install Strimzi.
We will install Strimzi in the cloudflow namespace. Make sure that this namespace exists before continuing. To create it, execute:
kubectl create ns cloudflow
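If this step is scripted or repeated, note that kubectl create ns fails when the namespace already exists. The following is a minimal sketch of an idempotent variant; the function name ensure_namespace is a hypothetical helper introduced here for illustration and is not part of kubectl or Cloudflow.

```shell
# Create the namespace only if it does not exist yet.
# ensure_namespace is a hypothetical helper name used for illustration.
ensure_namespace() {
  ns="${1:-cloudflow}"
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}
```

Call it as ensure_namespace cloudflow; with no argument it defaults to the cloudflow namespace used in this guide.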
Add the Strimzi Helm repository and update the local index.
helm repo add strimzi https://strimzi.io/charts/
helm repo update
Install the latest version of Strimzi.
helm install strimzi strimzi/strimzi-kafka-operator --namespace cloudflow
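Each Strimzi release supports only a specific set of Kafka versions, so the latest chart may not accept the Kafka version requested later in this guide. As a hedged sketch, the helpers below list the chart versions available in the repository and install a version chosen by the caller. The function names are hypothetical, and the chart version to use is deliberately left to you; check the Strimzi release notes for the one that matches your Kafka version.

```shell
# List the chart versions available in the strimzi Helm repository.
# Function names are illustrative, not part of Helm or Strimzi.
list_strimzi_versions() {
  helm search repo strimzi/strimzi-kafka-operator --versions
}

# Install a specific chart version.
# $1: chart version (required, chosen by the caller)
# $2: namespace (defaults to cloudflow, as used in this guide)
install_strimzi() {
  helm install strimzi strimzi/strimzi-kafka-operator \
    --namespace "${2:-cloudflow}" --version "$1"
}
```

Pinning the version also makes repeated installs reproducible, which helps when the development setup is rebuilt often.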
After the installation completes, the Strimzi Kafka operator should be running in the cloudflow namespace.
$ kubectl get pods -n cloudflow
NAME                                       READY   STATUS    RESTARTS   AGE
strimzi-cluster-operator-9968fd8c9-fhqmj   1/1     Running   0          17s
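Instead of polling the pod list by hand, a script can block until the operator is ready. This is a minimal sketch; it assumes the Deployment is named strimzi-cluster-operator, which is inferred from the pod name in the output above, and the function name is hypothetical.

```shell
# Block until the operator Deployment has finished rolling out.
# Assumption: the Deployment is named strimzi-cluster-operator
# (inferred from the pod name prefix shown in the output above).
# $1: namespace (default cloudflow), $2: timeout (default 300s)
wait_for_operator() {
  kubectl rollout status deployment/strimzi-cluster-operator \
    --namespace "${1:-cloudflow}" --timeout="${2:-300s}"
}
```

The command exits with a non-zero status if the rollout does not complete within the timeout, so it can gate the next step in a script.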
Creating a Kafka cluster using Strimzi
To create a Kafka cluster using Strimzi, we have to create a CustomResource of the kind Kafka in the namespace cloudflow.
The Kafka cluster configuration shown here is meant for development and testing purposes only. Please consider your storage requirements and modify the custom resource accordingly for your intended usage.
You can use the following command to create a cluster:
kubectl apply -f - <<EOF
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: cloudflow-strimzi
  namespace: cloudflow
spec:
  kafka:
    config:
      auto.create.topics.enable: false
      log.message.format.version: "2.3"
      log.retention.bytes: 1073741824
      log.retention.hours: 1
      log.retention.check.interval.ms: 300000
      offsets.topic.replication.factor: 3
      transaction.state.log.min.isr: 2
      transaction.state.log.replication.factor: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    livenessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    replicas: 3
    resources: {}
    storage:
      deleteClaim: false
      size: 100Gi
      type: persistent-claim
    version: 2.7.1
  zookeeper:
    livenessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    replicas: 3
    resources: {}
    storage:
      deleteClaim: false
      size: 10Gi
      type: persistent-claim
EOF
When the Strimzi Kafka operator has processed the custom resource, there should be six new pods in the cloudflow namespace: three Kafka broker pods and three ZooKeeper pods.
$ kubectl get pods -n cloudflow
NAME                                       READY   STATUS    RESTARTS   AGE
cloudflow-strimzi-kafka-0                  2/2     Running   0          1m
cloudflow-strimzi-kafka-1                  2/2     Running   0          1m
cloudflow-strimzi-kafka-2                  2/2     Running   0          1m
cloudflow-strimzi-zookeeper-0              1/1     Running   0          2m
cloudflow-strimzi-zookeeper-1              1/1     Running   0          2m
cloudflow-strimzi-zookeeper-2              1/1     Running   0          2m
strimzi-cluster-operator-9968fd8c9-fhqmj   1/1     Running   0          5m
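Rather than watching the pod list, you can wait on the Ready condition that the operator sets on the Kafka custom resource. A minimal sketch follows, assuming the resource name cloudflow-strimzi from the manifest above; the function name is hypothetical.

```shell
# Block until the Kafka custom resource reports the Ready condition.
# Assumes the resource is named cloudflow-strimzi, as in the manifest above.
# $1: namespace (default cloudflow), $2: timeout (default 300s)
wait_for_kafka() {
  kubectl wait kafka/cloudflow-strimzi --for=condition=Ready \
    --namespace "${1:-cloudflow}" --timeout="${2:-300s}"
}
```

Provisioning the persistent volumes can take a while on some clusters, so a longer timeout may be needed than the default used here.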
If you want to change any parameters of the Kafka cluster, edit the custom resource (for example, saved as a kafka-cluster.yaml file) and apply it again to the cluster with kubectl apply. The Strimzi Kafka cluster operator will make the necessary changes when it detects an update of the custom resource.
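As a sketch of that update workflow, the helper below previews the pending changes and then applies them. It assumes the custom resource was saved to a file such as kafka-cluster.yaml rather than piped through a heredoc; the function name is hypothetical.

```shell
# Preview and then apply changes to the Kafka custom resource.
# Assumes the resource was saved to a file (default: kafka-cluster.yaml).
update_kafka_cluster() {
  file="${1:-kafka-cluster.yaml}"
  # kubectl diff exits non-zero when differences are found (and on errors),
  # so its exit status is ignored here; the apply step reports real failures.
  kubectl diff -f "$file" || true
  kubectl apply -f "$file"
}
```

Some changes, such as broker configuration updates, may cause the operator to roll the broker pods one at a time, so expect brief restarts after applying.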
The most important parameters here are:
- Log retention policy: see this article and this one for an explanation of log retention. Our example uses a 1 hour retention (log.retention.hours), which is good enough for testing, with the retention check running every 300000 ms (log.retention.check.interval.ms). You can also use log.retention.bytes, a size-based retention policy (in bytes); the example sets it to 1073741824 (1 GiB).
- Kafka storage size: make sure that this value is greater than the maximum log retention size in bytes, or than the log size calculated from the time-based retention policy and the anticipated message rate.
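The sizing rule above can be sketched as simple arithmetic. The helpers below are a rough estimate under stated assumptions: partitions are spread evenly across brokers, log.retention.bytes applies per partition, and the message rate, message size, and partition count are illustrative inputs you supply. The function names are hypothetical. Treat the results as a lower bound and leave headroom, since Kafka only deletes whole closed log segments.

```shell
# Estimate per-broker storage (bytes) for time-based retention.
# $1: messages per second   $2: average message size in bytes
# $3: retention in hours    $4: replication factor   $5: number of brokers
time_based_bytes_per_broker() {
  echo $(( $1 * $2 * $3 * 3600 * $4 / $5 ))
}

# Estimate per-broker storage (bytes) for size-based retention.
# log.retention.bytes is a per-partition limit, so it is multiplied
# by the total number of partitions and the replication factor.
# $1: log.retention.bytes   $2: total partitions
# $3: replication factor    $4: number of brokers
size_based_bytes_per_broker() {
  echo $(( $1 * $2 * $3 / $4 ))
}
```

For example, time_based_bytes_per_broker 1000 1024 1 3 3 prints 3686400000 (about 3.4 GiB), and size_based_bytes_per_broker 1073741824 30 3 3 prints 32212254720 (30 GiB). Both fit within the 100Gi storage size used in the example cluster.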