Installing Apache Kafka on Ubuntu/Linux: A Step-by-Step Guide

Introduction:

Apache Kafka is a popular distributed streaming platform used for building real-time data pipelines and streaming applications. This guide walks you through installing the latest stable version of Apache Kafka on Ubuntu/Linux. By following these steps, you will have a working Kafka broker running on your system.

Prerequisites:

Ubuntu/Linux system

Java Development Kit (JDK) installed (version 8 or later; recent Kafka releases work best with Java 11 or 17)

Step 1: Update System Packages

Before installing Kafka, it's recommended to update the system packages to ensure you have the latest updates and dependencies. Open a terminal and execute the following command:

sudo apt update

Step 2: Install Java Development Kit (JDK)

Kafka requires Java to run. Install OpenJDK on your Ubuntu/Linux system with the following command:

sudo apt install default-jdk

Verify the Java installation by running:

java -version
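If you want to confirm you have a full JDK rather than just a runtime, one quick sketch is to check for `javac`, the Java compiler, which ships only with a JDK:

```shell
# Sketch: report whether a JDK is on the PATH. `javac` (the compiler) ships
# only with a JDK, so its presence distinguishes a JDK from a bare JRE.
check_jdk() {
    if command -v javac >/dev/null 2>&1; then
        echo "JDK found: $(javac -version 2>&1)"
    else
        echo "No JDK on PATH -- install one with: sudo apt install default-jdk"
    fi
}
check_jdk
```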

Step 3: Download and Extract Apache Kafka

Visit the Apache Kafka downloads page to find the latest stable version. Copy the download link for a binary release; binary package names include both the Scala and Kafka versions (e.g., kafka_2.13-3.7.0.tgz).

Open a terminal and change to the desired installation directory.

Use the wget command to download the Kafka binary package:

wget <paste_the_download_link_here>

Extract the downloaded package:

tar -xzf kafka_<version>.tgz

Replace <version> with the actual version you downloaded.
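Putting the download and extraction together, here is a minimal sketch. The version numbers and the mirror URL pattern are assumptions; prefer the exact link shown on the downloads page:

```shell
# Hypothetical release numbers -- check kafka.apache.org/downloads for the
# current stable release before copying these.
KAFKA_VERSION="3.7.0"
SCALA_VERSION="2.13"
TARBALL="kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz"

# The mirror URL pattern below is an assumption; older releases move to the
# archive, so prefer the link shown on the downloads page:
# wget "https://downloads.apache.org/kafka/${KAFKA_VERSION}/${TARBALL}"
# tar -xzf "${TARBALL}"

# The archive unpacks into a directory named after the tarball minus its extension:
KAFKA_DIR="${TARBALL%.tgz}"
echo "${KAFKA_DIR}"
```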

Step 4: Configure Apache Kafka

Move into the Kafka directory:

cd kafka_<version>

Modify the config/server.properties file to configure Kafka, using a text editor such as nano or vim. Some important configurations to consider include:

advertised.listeners: Set this to the IP address or hostname of the Kafka broker (e.g., advertised.listeners=PLAINTEXT://localhost:9092).

Other configurations like port numbers, data directories, and log settings can be adjusted according to your requirements.
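For orientation, here is a minimal sketch of what the relevant section of config/server.properties can look like. The values are illustrative assumptions for a single-broker setup on localhost; adjust them to your environment:

```properties
# config/server.properties -- illustrative single-broker values
broker.id=0
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kafka-logs
num.partitions=1
log.retention.hours=168
zookeeper.connect=localhost:2181
```

Note that log.dirs points at where Kafka stores topic data; /tmp is fine for experiments but is cleared on reboot, so pick a durable path for anything you want to keep.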

Step 5: Start Apache Kafka

Open a new terminal and navigate to the Kafka installation directory.

Start the ZooKeeper server with the following command. (ZooKeeper is required in Kafka's classic deployment mode; recent Kafka releases can also run without it in KRaft mode, but this guide uses the ZooKeeper-based setup.)

bin/zookeeper-server-start.sh config/zookeeper.properties

Open another terminal, navigate to the Kafka installation directory, and start the Kafka broker:

bin/kafka-server-start.sh config/server.properties

Apache Kafka is now running, and you can interact with it using various Kafka commands or client applications.
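Both start scripts run in the foreground and occupy their terminal. As a convenience, they also accept a -daemon flag that detaches the process; a minimal sketch, assuming you run it from the Kafka installation directory:

```shell
# Run from the Kafka installation directory. The -daemon flag detaches each
# process; output then goes to files under logs/ instead of the terminal.
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
bin/kafka-server-start.sh -daemon config/server.properties

# A quick check that the broker is listening on its default port (9092):
ss -tln | grep 9092
```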

Step 6: Create a Topic

Open a new terminal and navigate to the Kafka installation directory if you're not already there.

To create a topic named "my_topic" with a single partition and a replication factor of 1, use the following command:

bin/kafka-topics.sh --create --topic my_topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

Adjust the topic name, partition count, and replication factor as needed for your scenario.
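The same kafka-topics.sh script can also inspect what you created. A couple of useful follow-up commands, assuming the broker from Step 5 is still running:

```shell
# List every topic known to the broker:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092

# Show partitions, replication factor, and leader assignments for one topic:
bin/kafka-topics.sh --describe --topic my_topic --bootstrap-server localhost:9092
```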

Step 7: Publish Your First Message

Open a new terminal and navigate to the Kafka installation directory.

To start a producer and publish a message to the "my_topic" topic, execute the following command:

bin/kafka-console-producer.sh --topic my_topic --bootstrap-server localhost:9092

This command starts the console producer, allowing you to type and send messages.

Type a message in the terminal and press Enter to publish it to the topic. Each line you enter is sent as a separate message.
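To confirm a message actually landed in the topic, you can read it back with the console consumer that ships alongside the producer. A minimal round trip, assuming the broker and topic from the previous steps:

```shell
# Publish one message non-interactively by piping it into the producer:
echo "hello, kafka" | bin/kafka-console-producer.sh --topic my_topic --bootstrap-server localhost:9092

# Read the topic from the start; --from-beginning replays earlier messages and
# --max-messages 1 makes the consumer exit after one message instead of waiting:
bin/kafka-console-consumer.sh --topic my_topic --from-beginning --max-messages 1 --bootstrap-server localhost:9092
```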

You have now created a topic in Apache Kafka and published your first message. From here you can explore Kafka's rich ecosystem and features, including consuming messages, creating consumer applications, and building real-time data processing pipelines. Apache Kafka's distributed architecture and fault tolerance make it well suited to high-throughput streaming data scenarios.

Conclusion:

Congratulations! You have successfully installed the latest stable version of Apache Kafka on your Ubuntu/Linux system. By following this step-by-step guide, you can set up Kafka and start building real-time data streaming applications. Apache Kafka's scalability, fault tolerance, and high throughput make it a powerful platform for processing and managing streaming data.