Apache Kafka is a popular distributed message broker designed to efficiently handle large volumes of real-time data. A Kafka cluster is not only highly scalable and fault-tolerant, but it also has a much higher throughput compared to other message brokers such as ActiveMQ and RabbitMQ. Though it is generally used as a publish/subscribe messaging system, many organizations also use it for log aggregation because it offers persistent storage for published messages.
To follow this tutorial, you’ll need:
- An Ubuntu 18.04 server with a non-root user with sudo privileges
- 4GB or more of RAM
- OpenJDK 8 installed on your server
Step 1 — Creating a User for Kafka
Because Kafka can handle requests over a network, you should create a dedicated user for it. This minimizes damage to your Ubuntu machine should the Kafka server be compromised. This step creates a dedicated kafka user, but you should use a different non-root user for other tasks on this server.
Create a user called kafka with the useradd command:
sudo useradd kafka -m
The -m flag ensures that a home directory will be created for the user.
Set a password for the account with the command:
sudo passwd kafka
Grant sudo permissions to the user with the command:
sudo adduser kafka sudo
Your kafka user is now ready. Log into the account using su:
su -l kafka
Kafka now has a dedicated user, so you can move on to the installation.
Step 2 — Downloading and Extracting the Kafka Binaries
To download and extract the Kafka binaries into dedicated folders in the kafka user's home directory, start by creating a directory called Downloads to store the download:
mkdir ~/Downloads
Next, download the Kafka binaries using curl:
curl "http://www-eu.apache.org/dist/kafka/1.1.0/kafka_2.12-1.1.0.tgz" -o ~/Downloads/kafka.tgz
Then, create a directory called kafka and make it the base directory for the installation:
mkdir ~/kafka && cd ~/kafka
Extract the file you downloaded with the command:
tar -xvzf ~/Downloads/kafka.tgz --strip 1
The --strip 1 flag ensures that the archive's contents are extracted into ~/kafka/ itself and not into another directory inside it, such as ~/kafka/kafka_2.12-1.1.0/.
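To see exactly what --strip 1 (an abbreviation of GNU tar's --strip-components=1) does, here is a small self-contained experiment; the archive and directory names under /tmp are made up for the demonstration:

```shell
# Build a tiny archive whose contents live under a top-level directory,
# mimicking the kafka_2.12-1.1.0/ directory inside kafka.tgz.
mkdir -p /tmp/strip-demo/kafka_2.12-1.1.0
echo "demo" > /tmp/strip-demo/kafka_2.12-1.1.0/NOTICE
tar -czf /tmp/strip-demo.tgz -C /tmp/strip-demo kafka_2.12-1.1.0

# Extract with --strip-components=1: the leading directory component is
# dropped, so NOTICE lands directly in the extraction directory.
mkdir -p /tmp/strip-demo/out && cd /tmp/strip-demo/out
tar -xzf /tmp/strip-demo.tgz --strip-components=1
ls
```

Without the flag, the same extraction would produce kafka_2.12-1.1.0/NOTICE instead of NOTICE at the top level.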
Step 3 — Configuring the Kafka Server
By default, Kafka will not allow you to delete a topic (the category, group, or feed name to which messages can be published). You can change this by editing the configuration file.
Kafka's configuration options are specified in server.properties. Open this file with nano or your favorite editor:
nano ~/kafka/config/server.properties
Add a setting that will allow you to delete Kafka topics by appending the following line to the bottom of the file:
delete.topic.enable = true
Save and close the file.
Step 4 — Creating Systemd Unit Files and Starting the Kafka Server
To perform service actions such as starting, stopping, and restarting Kafka in a manner consistent with other Linux services, create systemd unit files for it.
Start by creating the unit file for zookeeper, a service that Kafka uses to manage its cluster state and settings:
sudo nano /etc/systemd/system/zookeeper.service
Then, enter the following unit definition into the file:
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
The [Unit] section specifies that Zookeeper requires networking and the filesystem to be ready before it can start. The [Service] section specifies that systemd should use the zookeeper-server-start.sh and zookeeper-server-stop.sh shell files to start and stop the service.
Next, create the systemd service file for kafka:
sudo nano /etc/systemd/system/kafka.service
Enter the following:
[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties > /home/kafka/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
The [Unit] section specifies that this unit depends on zookeeper.service, which ensures that zookeeper gets started automatically when the kafka service starts. The [Service] section specifies that systemd should use the kafka-server-start.sh and kafka-server-stop.sh shell files to start and stop the service.
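Before installing a hand-written unit file, it can help to sanity-check it. On a systemd machine, systemd-analyze verify does this properly; as a minimal stand-in, the sketch below writes the unit to a throwaway path and greps for the directives this tutorial relies on:

```shell
# Write the unit to a temp path; the real destination is
# /etc/systemd/system/kafka.service.
cat > /tmp/kafka.service.demo <<'EOF'
[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties > /home/kafka/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
EOF

# Confirm that each directive systemd needs is present.
for key in Requires After ExecStart ExecStop WantedBy; do
  grep -q "^${key}=" /tmp/kafka.service.demo && echo "${key}: ok"
done
```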
Now that the units have been defined, start Kafka with the following command:
sudo systemctl start kafka
Ensure that the server has started with the command:
sudo journalctl -u kafka
You should see output similar to this:
Jul 17 18:38:59 kafka-ubuntu systemd: Started kafka.service.
To enable Kafka to start automatically when the server boots, run:
sudo systemctl enable kafka
Step 5 — Testing the Installation
To test that Kafka is working, let's publish and consume a message. Publishing messages in Kafka requires two things:
- A producer, which enables the publication of records and data to topics.
- A consumer, which reads messages and data from topics.
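As a loose analogy only (Kafka persists messages durably and lets many consumers replay them, which a pipe does not), a named pipe illustrates the producer-to-consumer handoff in plain shell; the FIFO path here is invented for the demonstration:

```shell
# A FIFO stands in for a "topic": one writer, one reader.
mkfifo /tmp/TutorialTopic.fifo

# "Producer": writes a record into the pipe in the background.
echo "Hello, World" > /tmp/TutorialTopic.fifo &

# "Consumer": reads the record back out and prints it.
cat /tmp/TutorialTopic.fifo

rm /tmp/TutorialTopic.fifo
```

The real kafka-console-producer.sh and kafka-console-consumer.sh scripts used below follow the same write/read pattern, but through the broker rather than a pipe.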
To start, create a topic named TutorialTopic:
~/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic TutorialTopic
You can create a producer from the command line using the kafka-console-producer.sh script. Publish the string "Hello, World" to the TutorialTopic topic by entering:
echo "Hello, World" | ~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic TutorialTopic > /dev/null
The > /dev/null redirect suppresses the producer script's console output.
Next, create a Kafka consumer using the kafka-console-consumer.sh script:
~/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic TutorialTopic --from-beginning
If there are no configuration problems, you will see Hello, World in your terminal.
Step 6 — Restricting the Kafka User
Now that you have installed and tested Kafka, you can remove the kafka user's admin privileges. Before you do so, log out and log back in as any other non-root sudo user.
Remove the kafka user from the sudo group:
sudo deluser kafka sudo
Lock the kafka user's password with the passwd command so that nobody can log into the server directly using this account:
sudo passwd kafka -l
From now on, only root or a sudo user can log in as kafka with the following command:
sudo su - kafka
If you need to unlock the account in the future, use passwd with the -u flag:
sudo passwd kafka -u
You now have Apache Kafka running on your Ubuntu server. If you have questions about this tutorial, leave a comment in the comments section below.