Apache Kafka is a distributed streaming platform. With its rich API (Application Programming Interface) set, we can connect almost anything to Kafka as a source of data, and on the other end we can set up a large number of consumers that will receive the stream of records for processing. Kafka is highly scalable, and stores the streams of data in a reliable, fault-tolerant way. From the connectivity perspective, Kafka can serve as a bridge between many heterogeneous systems, which in turn can rely on its capabilities to transfer and persist the data provided.
In this tutorial we will install Apache Kafka on Red Hat Enterprise Linux 8, create the systemd unit files for ease of management, and test the functionality with the shipped command line tools.
In this tutorial you will learn:
- How to install Apache Kafka
- How to create systemd services for Kafka and Zookeeper
- How to test Kafka with command line clients
Software Requirements and Conventions Used
| Category | Requirements, Conventions or Software Version Used |
|---|---|
| System | Red Hat Enterprise Linux 8 |
| Software | Apache Kafka 2.1.0 |
| Other | Privileged access to your Linux system as root or via the `sudo` command. |
| Conventions | `#` – requires given linux commands to be executed with root privileges, either directly as the root user or by use of the `sudo` command; `$` – requires given linux commands to be executed as a regular non-privileged user |
How to install Kafka on Red Hat 8 step by step instructions
Apache Kafka is written in Java, so all we need is OpenJDK 8 installed to proceed with the installation. Kafka relies on Apache Zookeeper, a distributed coordination service that is also written in Java and is shipped with the package we will download. While installing HA (High Availability) services on a single node defeats their purpose, we'll install and run Zookeeper for Kafka's sake.
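If Java is not installed yet, RHEL 8 ships OpenJDK 8 in its repositories. A quick way to install it and sanity-check the runtime (the package name below is the one used on RHEL 8; the version check is just an illustrative test):

```shell
# Install OpenJDK 8 from the RHEL 8 repositories (run as root)
dnf install -y java-1.8.0-openjdk

# Sanity check: the runtime should report version 1.8.x
java -version 2>&1 | grep -q 'version "1\.8' && echo "Java 8 is available"
```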
- To download Kafka from the closest mirror, we need to consult the official download site. We can copy the URL of the `.tgz` file from there. We'll use `wget` and the pasted URL to download the package to the target machine:

```
# wget https://www-eu.apache.org/dist/kafka/2.1.0/kafka_2.11-2.1.0.tgz -O /opt/kafka_2.11-2.1.0.tgz
```
- We enter the `/opt` directory, and extract the archive:

```
# cd /opt
# tar -xvf kafka_2.11-2.1.0.tgz
```

And create a symlink called `/opt/kafka` that points to the now created `/opt/kafka_2.11-2.1.0` directory to make our lives easier:

```
# ln -s /opt/kafka_2.11-2.1.0 /opt/kafka
```
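To confirm the symlink points where we expect, we can resolve it; `readlink -f` prints the canonical target (a quick sanity check, not required for the installation):

```shell
# Resolve the symlink; should print the versioned directory, /opt/kafka_2.11-2.1.0
readlink -f /opt/kafka
```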
- We create a non-privileged user that will run both the `zookeeper` and `kafka` services:

```
# useradd kafka
```
- And set the new user as owner of the whole directory we extracted, recursively:

```
# chown -R kafka:kafka /opt/kafka*
```
- We create the unit file `/etc/systemd/system/zookeeper.service` with the following content:

```
[Unit]
Description=zookeeper
After=syslog.target network.target

[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh

[Install]
WantedBy=multi-user.target
```
Note that we don't need to write the version number three times because of the symlink we created. The same applies to the next unit file for Kafka, `/etc/systemd/system/kafka.service`, that contains the following lines of configuration:

```
[Unit]
Description=Apache Kafka
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target
```
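Optionally, if we'd like systemd to restart Kafka automatically after a crash, a drop-in file can extend the unit without editing it. `Restart=` and `RestartSec=` are standard systemd directives; the drop-in file name below is illustrative:

```
# /etc/systemd/system/kafka.service.d/restart.conf
[Service]
Restart=on-failure
RestartSec=5
```

After creating a drop-in, `systemctl daemon-reload` must be run again for it to take effect.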
- We need to reload `systemd` so it reads the new unit files:

```
# systemctl daemon-reload
```
- Now we can start our new services, in this order:

```
# systemctl start zookeeper
# systemctl start kafka
```
If all goes well, `systemd` should report a running state in both services' status, similar to the outputs below:

```
# systemctl status zookeeper.service
  zookeeper.service - zookeeper
   Loaded: loaded (/etc/systemd/system/zookeeper.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-01-10 20:44:37 CET; 6s ago
 Main PID: 11628 (java)
    Tasks: 23 (limit: 12544)
   Memory: 57.0M
   CGroup: /system.slice/zookeeper.service
           11628 java -Xmx512M -Xms512M -server [...]

# systemctl status kafka.service
  kafka.service - Apache Kafka
   Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-01-10 20:45:11 CET; 11s ago
 Main PID: 11949 (java)
    Tasks: 64 (limit: 12544)
   Memory: 322.2M
   CGroup: /system.slice/kafka.service
           11949 java -Xmx1G -Xms1G -server [...]
```
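Another quick check is that the two services are listening on their default ports, 2181 for Zookeeper and 9092 for Kafka, the same ports the client tools in this tutorial connect to. A sketch using `ss`, which ships with RHEL 8:

```shell
# List listening TCP sockets and filter for the Zookeeper and Kafka ports
ss -tln | grep -E ':(2181|9092)\b'
```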
- Optionally we can enable automatic start on boot for both services:

```
# systemctl enable zookeeper.service
# systemctl enable kafka.service
```
- To test functionality, we'll connect to Kafka with one producer and one consumer client. The messages provided by the producer should appear on the console of the consumer. But before this we need a medium for these two to exchange messages on. We create a new channel of data, called a `topic` in Kafka's terms, where the producer will publish and where the consumer will subscribe. We'll call the topic `FirstKafkaTopic`, and use the `kafka` user to create it:

```
$ /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic FirstKafkaTopic
```
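To confirm the topic was created with the intended settings, the same tool's `--describe` switch prints the partition count, replication factor and leader assignment:

```shell
# Inspect the topic we just created (run as the kafka user)
/opt/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic FirstKafkaTopic
```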
- We start a consumer client from the command line that will subscribe to the (at this point empty) topic created in the previous step:

```
$ /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstKafkaTopic --from-beginning
```

We leave this console, and the client running in it, open. This console is where we will receive the messages we publish with the producer client.
- On another terminal, we start a producer client and publish some messages to the topic we created. We can query Kafka for available topics:

```
$ /opt/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181
FirstKafkaTopic
```

Then we connect to the one the consumer is subscribed to, and send a message:

```
$ /opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstKafkaTopic
> new message published by producer from console #2
```
At the consumer terminal, the message should appear shortly:

```
$ /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstKafkaTopic --from-beginning
new message published by producer from console #2
```
If the message appears, our test is successful, and our Kafka installation is working as intended. Many clients could publish and consume one or more topics' records the same way, even with the single-node setup we created in this tutorial.
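The same round trip can also be scripted: both console clients read stdin and write stdout, so we can pipe a message in and replay the topic non-interactively. A sketch assuming the single-node setup above; `--max-messages` and `--timeout-ms` are options of the console consumer in this release:

```shell
# Publish one message from stdin instead of an interactive prompt
echo "scripted test message" | /opt/kafka/bin/kafka-console-producer.sh \
    --broker-list localhost:9092 --topic FirstKafkaTopic

# Replay the topic from the beginning, stopping after the first message
# or after 10 seconds, whichever comes first
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
    --topic FirstKafkaTopic --from-beginning --max-messages 1 --timeout-ms 10000
```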