Apache Kafka: Event-Driven Architecture Without the PhD
Learn Apache Kafka from scratch -- topics, partitions, producers, consumers, and real-world patterns with Python examples and Docker setup.
Every backend developer eventually hits a wall. You have Service A that needs to tell Service B something happened. So you make an HTTP call. Then Service C needs to know too. And Service D. Now you have a web of synchronous calls, and when Service C goes down for 30 seconds, everything backs up.
This is where event-driven architecture enters the picture, and Apache Kafka is the tool most teams reach for. It's a distributed event streaming platform -- think of it as a very durable, very fast message highway between your services.
Kafka has a reputation for being complex. And at scale, it absolutely is. But getting started, understanding the core concepts, and building useful things with it? That's more approachable than you'd think.
What Event-Driven Architecture Actually Means
In a traditional request-response architecture, services talk directly to each other. Service A calls Service B's API, waits for a response, and continues. This is simple and works well until it doesn't.
In an event-driven architecture, services communicate by publishing and subscribing to events. Service A publishes an "order-placed" event. Services B, C, and D each independently consume that event and do their own thing. Service A doesn't know or care what happens next.
The benefits:
- Decoupling -- Services don't need to know about each other
- Resilience -- If Service C is down, the event waits. No data loss.
- Scalability -- Add more consumers without changing the producer
- Audit trail -- Every event is stored and replayable
Kafka is the backbone that makes this work at scale.
Kafka Core Concepts
Before touching any code, you need to understand five things:
Topics
A topic is a named stream of events. Think of it as a category or channel. user-signups, order-events, payment-transactions -- each is a topic. Producers write to topics. Consumers read from topics.
Partitions
Each topic is split into partitions -- numbered, ordered sequences of events. Partitions are Kafka's secret weapon for parallelism. If your topic has 4 partitions, you can have 4 consumers reading in parallel, each handling a different partition.
Events within a partition are strictly ordered. Events across partitions are not. This matters when ordering is important (e.g., all events for user #42 should be processed in order). You control which partition an event goes to by setting a key.
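A toy model makes key-based partitioning concrete. This sketch uses a simple CRC32-modulo scheme purely for illustration -- real clients ship their own partitioners (the Java client hashes keys with murmur2; librdkafka defaults to a consistent CRC32-based hash) -- but the principle is the same: hash the key, take it modulo the partition count.

```python
# Toy model of key-based partitioning. Real Kafka clients use murmur2
# (Java) or a consistent CRC32 hash (librdkafka); zlib.crc32 here just
# illustrates the idea.
import zlib

NUM_PARTITIONS = 4

def pick_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash the key and map it onto a partition number."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# The same key always lands on the same partition...
assert pick_partition("user-42") == pick_partition("user-42")

# ...so all events for one user stay ordered, while different keys
# spread across partitions for parallelism.
for key in ["user-42", "user-7", "user-99"]:
    print(f"{key} -> partition {pick_partition(key)}")
```

This is also why you can't grow a topic's partition count without consequences: changing `num_partitions` changes where existing keys map.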
Producers
Producers publish events to topics. They're the writers. A producer sends a message (a key-value pair plus optional headers) to a specific topic. If you provide a key, Kafka hashes it to determine the partition. Same key always goes to the same partition, which guarantees ordering for that key.
Consumers and Consumer Groups
Consumers read events from topics. They're the readers. But here's the important part: consumers belong to consumer groups.
Within a consumer group, each partition is assigned to exactly one consumer. If you have 4 partitions and 2 consumers in the same group, each consumer handles 2 partitions. If you have 4 consumers, each handles 1 partition. If you have 5 consumers... one sits idle (you can't have more consumers than partitions).
Different consumer groups independently track their own position (offset) in the topic. So your analytics service and your notification service can each consume the same events at their own pace.
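The partition-to-consumer arithmetic above is easy to simulate. This is a toy round-robin assignment, not Kafka's actual group protocol (Kafka ships several assignment strategies -- range, round-robin, cooperative-sticky -- and handles rebalancing when consumers join or leave), but it shows how partitions map to members of one group:

```python
# Toy version of consumer-group partition assignment. Kafka's group
# coordinator does this (plus rebalancing) for real.

def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Spread partitions round-robin over the consumers in one group."""
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        consumer = consumers[partition % len(consumers)]
        assignment[consumer].append(partition)
    return assignment

print(assign_partitions(4, ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1, 3]}

print(assign_partitions(4, ["c1", "c2", "c3", "c4", "c5"]))
# c5 gets no partitions -- it sits idle
```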
Offsets
Every event in a partition has a sequential number called an offset. Consumers track which offset they've processed. If a consumer crashes and restarts, it picks up from its last committed offset. No events lost.
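The commit-and-resume contract can be modeled in a few lines. A simplified sketch -- real committed offsets live in Kafka's internal __consumer_offsets topic, per group and partition -- but the behavior is the same:

```python
# Toy model of offset tracking: a consumer group commits the offset of
# the next event to read, and resumes from there after a restart.
committed: dict = {}  # (group, partition) -> next offset to read

def consume(group: str, partition: int, log: list, batch: int) -> list:
    """Read up to `batch` events starting at the last committed offset."""
    start = committed.get((group, partition), 0)
    events = log[start:start + batch]
    committed[(group, partition)] = start + len(events)  # commit
    return events

log = ["e0", "e1", "e2", "e3", "e4"]
print(consume("analytics", 0, log, 3))  # ['e0', 'e1', 'e2']
# Simulated crash + restart: the next poll resumes at offset 3. No events lost.
print(consume("analytics", 0, log, 3))  # ['e3', 'e4']
```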
Setting Up Kafka with Docker
Running Kafka locally used to be painful. Docker makes it reasonable:
# docker-compose.yml
version: '3.8'
services:
  kafka:
    image: confluentinc/confluent-local:7.6.0
    hostname: kafka
    container_name: kafka
    ports:
      - "9092:9092"
      - "9101:9101"
    environment:
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka:29093'
      KAFKA_LISTENERS: 'PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:29093'
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_NODE_ID: 1
      KAFKA_CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qk'
This uses KRaft mode (no ZooKeeper needed -- Kafka manages its own metadata since version 3.3).
docker-compose up -d
Verify it's running:
docker exec -it kafka kafka-topics --bootstrap-server localhost:9092 --list
Create your first topic:
docker exec -it kafka kafka-topics --bootstrap-server localhost:9092 \
--create --topic user-events --partitions 3 --replication-factor 1
Producing and Consuming with Python
Install the confluent-kafka library -- it's a Python wrapper around Kafka's C client (librdkafka), so it's fast:
pip install confluent-kafka
Producing Events
# producer.py
import json
from confluent_kafka import Producer

def delivery_report(err, msg):
    """Callback for message delivery confirmation."""
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}] @ offset {msg.offset()}")

producer = Producer({
    'bootstrap.servers': 'localhost:9092',
    'client.id': 'my-producer'
})

# Produce some events
events = [
    {"user_id": "u001", "action": "signup", "email": "alice@example.com"},
    {"user_id": "u002", "action": "signup", "email": "bob@example.com"},
    {"user_id": "u001", "action": "login", "ip": "192.168.1.10"},
    {"user_id": "u003", "action": "signup", "email": "carol@example.com"},
    {"user_id": "u002", "action": "login", "ip": "10.0.0.5"},
]

for event in events:
    producer.produce(
        topic='user-events',
        key=event["user_id"],  # Same user -> same partition -> ordered
        value=json.dumps(event),
        callback=delivery_report
    )
    producer.poll(0)  # Trigger delivery callbacks

# Wait for all messages to be delivered
producer.flush()
print("All messages delivered")
Key points:
- The key determines the partition. All events for u001 go to the same partition, so they're processed in order.
- poll(0) triggers delivery callbacks without blocking.
- flush() blocks until all queued messages are delivered. Always call this before exiting.
Consuming Events
# consumer.py
import json
from confluent_kafka import Consumer, KafkaError

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'user-event-processors',
    'auto.offset.reset': 'earliest',  # Start from beginning if no committed offset
    'enable.auto.commit': True,
})

consumer.subscribe(['user-events'])
print("Waiting for events... (Ctrl+C to stop)")

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            if msg.error().code() == KafkaError._PARTITION_EOF:
                print(f"Reached end of partition {msg.partition()}")
            else:
                print(f"Error: {msg.error()}")
            continue

        event = json.loads(msg.value().decode('utf-8'))
        print(f"Received: partition={msg.partition()}, offset={msg.offset()}, "
              f"key={msg.key().decode('utf-8')}, event={event}")

        # Process the event
        if event["action"] == "signup":
            print(f"  -> New user: {event.get('email', 'unknown')}")
        elif event["action"] == "login":
            print(f"  -> User login from {event.get('ip', 'unknown')}")
except KeyboardInterrupt:
    print("Shutting down...")
finally:
    consumer.close()
Run the consumer in one terminal, then run the producer in another. You'll see events flowing through.
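One caveat before moving on: json.loads raises on a malformed payload, and a single bad message would kill a loop like the one above. A small guard you might wrap around deserialization -- a sketch only, since in production you'd typically route bad messages to a dead-letter topic rather than just logging them:

```python
import json

def safe_deserialize(raw: bytes):
    """Return the parsed event, or None if the payload is malformed."""
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as e:
        # In production, send the raw bytes to a dead-letter topic instead.
        print(f"Skipping malformed message: {e}")
        return None

assert safe_deserialize(b'{"action": "signup"}') == {"action": "signup"}
assert safe_deserialize(b"not json at all") is None
```

In the consumer loop, a None result just means `continue` to the next message instead of crashing.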
Real-World Pattern: Order Processing Pipeline
Let's build something closer to reality -- an order processing system with multiple consumers:
# order_producer.py
import json
import uuid
import random
import time
from confluent_kafka import Producer

producer = Producer({'bootstrap.servers': 'localhost:9092'})

products = ["Widget A", "Gadget B", "Doohickey C", "Thingamajig D"]

def produce_order():
    order = {
        "order_id": str(uuid.uuid4())[:8],
        "customer_id": f"cust-{random.randint(100, 999)}",
        "product": random.choice(products),
        "quantity": random.randint(1, 5),
        "price": round(random.uniform(9.99, 99.99), 2),
        "timestamp": time.time()
    }
    producer.produce(
        topic='orders',
        key=order["customer_id"],
        value=json.dumps(order)
    )
    producer.poll(0)
    return order

# Simulate incoming orders
for i in range(20):
    order = produce_order()
    print(f"Order placed: {order['order_id']} - {order['product']} x{order['quantity']}")
    time.sleep(0.5)

producer.flush()
Now create separate consumers for different concerns:
# inventory_consumer.py
import json
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'inventory-service',  # Different group = independent consumption
    'auto.offset.reset': 'earliest',
})
consumer.subscribe(['orders'])
print("[Inventory Service] Listening for orders...")

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        order = json.loads(msg.value().decode('utf-8'))
        print(f"[Inventory] Reducing stock: {order['product']} by {order['quantity']}")
except KeyboardInterrupt:
    consumer.close()
# notification_consumer.py
import json
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'notification-service',  # Different group!
    'auto.offset.reset': 'earliest',
})
consumer.subscribe(['orders'])
print("[Notification Service] Listening for orders...")

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        order = json.loads(msg.value().decode('utf-8'))
        print(f"[Notify] Sending confirmation to {order['customer_id']}: "
              f"Order {order['order_id']} placed")
except KeyboardInterrupt:
    consumer.close()
Run both consumers, then the producer. Each consumer processes every order independently. The inventory service and notification service don't know about each other. Add an analytics service next week? Just create a new consumer with a new group ID. No changes to anything else.
Manual Offset Management
Auto-commit is convenient but risky. Offsets are committed in the background on a timer, so a consumer can commit an offset before it has actually finished processing the message. If it crashes in that window, the group resumes past the message and it's effectively lost. For critical data, commit manually:
consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'careful-consumers',
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False,  # We'll handle this ourselves
})
consumer.subscribe(['orders'])

batch = []
BATCH_SIZE = 10

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        batch.append(msg)

        if len(batch) >= BATCH_SIZE:
            # Process the batch
            for m in batch:
                order = json.loads(m.value().decode('utf-8'))
                process_order(order)  # Your business logic
            # Commit AFTER successful processing
            consumer.commit(asynchronous=False)
            print(f"Committed batch of {len(batch)} messages")
            batch = []
except KeyboardInterrupt:
    # Process remaining messages before shutting down
    if batch:
        for m in batch:
            process_order(json.loads(m.value().decode('utf-8')))
        consumer.commit(asynchronous=False)
finally:
    consumer.close()
Kafka Admin Operations with Python
You can manage topics programmatically:
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({'bootstrap.servers': 'localhost:9092'})

# Create topics
new_topics = [
    NewTopic("analytics-events", num_partitions=6, replication_factor=1),
    NewTopic("error-logs", num_partitions=3, replication_factor=1),
]
fs = admin.create_topics(new_topics)
for topic, f in fs.items():
    try:
        f.result()  # Raises on failure
        print(f"Topic '{topic}' created")
    except Exception as e:
        print(f"Failed to create '{topic}': {e}")

# List topics
metadata = admin.list_topics(timeout=10)
for topic in metadata.topics:
    print(f"Topic: {topic}, Partitions: {len(metadata.topics[topic].partitions)}")
When Kafka Is Overkill
Kafka is powerful, but it's not always the right tool. Don't use Kafka when:
- You have 2-3 services and simple HTTP calls work fine. Adding Kafka to a small system introduces complexity without much benefit.
- You need request-response semantics. Kafka is fire-and-forget by design. If you need a response from the consumer, you're fighting the abstraction.
- Your volume is low. Processing 100 events per day? A simple database queue or even Redis Pub/Sub will do.
- You don't have the operational capacity. Kafka clusters need monitoring, disk management, and partition rebalancing. If you're a solo developer, consider managed alternatives (Confluent Cloud, AWS MSK, or Upstash Kafka).

Lighter alternatives worth knowing:
- Redis Streams -- Lightweight, fast, good for moderate volumes
- RabbitMQ -- Better for task queues and request-reply patterns
- AWS SQS/SNS -- Fully managed, pay-per-message, zero ops
- PostgreSQL LISTEN/NOTIFY -- If your events are already in Postgres
Common Mistakes
- Not setting a message key -- Without a key, events are round-robined across partitions, which breaks ordering guarantees. If order matters (and for user events, it usually does), set a key.
- Too many or too few partitions -- You can't reduce partitions after creation. Start with 3-6 for development. In production, partition count roughly equals your maximum consumer parallelism.
- Ignoring consumer lag -- If your consumers fall behind, events pile up. Monitor lag and either scale consumers or fix slow processing.
- Not handling deserialization errors -- A single malformed message can crash your consumer loop. Always wrap deserialization in try/except.
- Treating Kafka like a database -- Kafka retains messages for a configurable period (default 7 days), not forever. It's a stream, not a data store. If you need permanent storage, consume events and write them to a database.
What's Next
This tutorial covers the fundamentals, but Kafka's ecosystem goes deep:
- Kafka Connect -- Pre-built connectors for databases, S3, Elasticsearch, and dozens more
- Schema Registry -- Enforce message schemas with Avro or Protobuf
- Kafka Streams / ksqlDB -- Stream processing directly on Kafka data
- Exactly-once semantics -- Transactional producers and idempotent consumers
- Multi-cluster replication -- MirrorMaker 2 for geo-distributed setups
For more backend architecture and Python tutorials, check out CodeUp.