
Building Event-Driven Data Pipelines with Apache Kafka

How to design, build, and operate reliable event-driven architectures with Kafka — from schema design to exactly-once processing.

Sarah Chen Dec 18, 2025 11 min read
Kafka Event-Driven Data Pipelines Streaming

Apache Kafka has become the backbone of modern data architecture. At Vaarak, we've deployed Kafka clusters processing billions of events per day for logistics tracking, financial transactions, IoT telemetry, and real-time analytics. But Kafka's power comes with complexity — getting it right requires careful design decisions upfront.

[Figure: data streaming and processing visualization. Event-driven architectures decouple producers from consumers, enabling independent scaling and evolution.]

Topic Design: Get This Right First

Topic design is the most consequential decision in a Kafka deployment. Topics define your data contracts, partitioning strategy, and retention policy. A poorly designed topic structure leads to hot partitions, consumer group rebalancing storms, and ordering guarantees that are impossible to maintain.

  • One topic per event type (order.created, order.updated, payment.processed) — not one topic per entity with a type field
  • Choose partition keys carefully — use the entity ID (orderId, userId) to guarantee ordering within an entity
  • Set partition count based on expected throughput — each partition can handle ~10MB/s; over-partition early since you can't reduce partition count later
  • Use Avro or Protobuf schemas registered in Schema Registry — JSON is flexible but schema evolution is a nightmare without a registry
producers/order-events.ts
import { Kafka, Partitioners } from 'kafkajs';
import { SchemaRegistry } from '@kafkajs/confluent-schema-registry';

interface OrderEvent {
  orderId: string;
  type: string;
  correlationId: string;
  [field: string]: unknown;
}

const kafka = new Kafka({ brokers: ['kafka-1:9092', 'kafka-2:9092', 'kafka-3:9092'] });
const registry = new SchemaRegistry({ host: 'http://schema-registry:8081' });
const producer = kafka.producer({ createPartitioner: Partitioners.DefaultPartitioner });

async function publishOrderEvent(event: OrderEvent) {
  // Note: producer.connect() must have been awaited once at startup.
  const schemaId = await registry.getLatestSchemaId('order-events-value');
  const encodedValue = await registry.encode(schemaId, event);

  await producer.send({
    topic: 'order-events',
    messages: [{
      key: event.orderId,    // Partition key = orderId for ordering guarantee
      value: encodedValue,
      headers: {
        'event-type': event.type,
        'correlation-id': event.correlationId,
        'timestamp': Date.now().toString(),
      },
    }],
  });
}
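The partition-sizing rule of thumb from the list above (roughly 10 MB/s per partition, over-provisioned because the count can't shrink) can be sketched as a back-of-the-envelope calculation. The headroom factor of 2 here is an illustrative choice, not a Kafka constant:

```typescript
// Rough partition-count estimate from the ~10 MB/s-per-partition rule of thumb.
// headroomFactor over-provisions up front, since partition count can be
// increased later but never reduced.
function partitionCountFor(
  peakThroughputMBps: number,
  perPartitionMBps = 10,
  headroomFactor = 2,
): number {
  const base = Math.ceil(peakThroughputMBps / perPartitionMBps);
  return base * headroomFactor;
}

// e.g. a topic expecting 50 MB/s at peak:
// partitionCountFor(50) === 10  (5 partitions needed, doubled for headroom)
```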

Consumer Patterns: Idempotency and Exactly-Once

Kafka guarantees at-least-once delivery, which means consumers must be idempotent — processing the same message twice should produce the same result. The simplest approach is an idempotency key: store the event ID in your database within the same transaction as your business logic. If the event ID already exists, skip processing.

Exactly-once semantics in Kafka means exactly-once within the Kafka ecosystem (producer → topic → consumer). For end-to-end exactly-once (Kafka → external database), you need idempotent writes on the consumer side — or commit the consumed offsets in the same database transaction as your state changes. The transactional outbox pattern covers the mirror-image problem on the producer side: updating the database and publishing to Kafka atomically.
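The idempotency-key check above can be sketched as follows. `ProcessedStore` and `handleOrderEvent` are illustrative names, and the injected store stands in for a database table — in production, `record()` and the business write happen inside one transaction:

```typescript
// Sketch of an idempotent consumer handler: skip any event whose ID has
// already been recorded, otherwise apply the business logic and record the ID.
interface ConsumedEvent {
  eventId: string;
  orderId: string;
  type: string;
}

interface ProcessedStore {
  has(eventId: string): boolean;
  record(eventId: string): void;
}

function handleOrderEvent(
  event: ConsumedEvent,
  store: ProcessedStore,
  apply: (event: ConsumedEvent) => void,
): boolean {
  if (store.has(event.eventId)) return false; // duplicate delivery: skip
  apply(event);                 // business logic
  store.record(event.eventId);  // same transaction as apply() in production
  return true;
}
```

Re-delivering the same event is now harmless: the second call finds the recorded ID and returns without re-running the business logic.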

Dead Letter Queues and Error Handling

Every production Kafka consumer needs a dead letter queue (DLQ). When a message fails processing after N retries, it's moved to a DLQ topic for manual investigation. Without a DLQ, a single poison message blocks the entire consumer group — one bad event stops all event processing.
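The retry-then-DLQ routing can be sketched client-agnostically. `processOrRoute` and the injected `publishToDlq` are illustrative names; in practice the publisher would produce to a topic like `order-events.dlq` using whatever Kafka client you already have:

```typescript
// Retry a handler up to maxRetries extra attempts; on exhaustion, hand the
// message to a DLQ publisher instead of blocking the consumer group.
async function processOrRoute<T>(
  message: T,
  handler: (msg: T) => Promise<void>,
  publishToDlq: (msg: T, error: Error) => Promise<void>,
  maxRetries = 3,
): Promise<"processed" | "dead-lettered"> {
  let lastError = new Error("unknown");
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      await handler(message);
      return "processed";
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
    }
  }
  await publishToDlq(message, lastError);
  return "dead-lettered";
}
```

Either way the function returns, so the consumer can commit the offset and move on — the poison message is parked, not blocking.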

Monitoring and Alerting

The most critical Kafka metric is consumer lag — the difference between the latest produced offset and the latest consumed offset. Rising consumer lag means your consumers can't keep up with producers, and you're falling behind on processing. Alert on sustained lag increases, not absolute values (a brief spike during a deployment is normal).
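The two ideas above — lag as an offset difference, and alerting on sustained increases — can be sketched as follows. The strictly-increasing-over-N-samples check is one simple way to encode "sustained"; real monitoring stacks usually express this as a rate over a time window instead:

```typescript
// Consumer lag per partition: latest produced offset (high watermark)
// minus the consumer group's committed offset.
function partitionLag(highWatermark: number, committedOffset: number): number {
  return Math.max(0, highWatermark - committedOffset);
}

// Alert condition: lag has risen in every one of the last `window` samples.
// A brief spike (e.g. during a deployment) won't trip this; a steady climb will.
function lagIsRising(lagSamples: number[], window = 5): boolean {
  if (lagSamples.length < window) return false;
  const recent = lagSamples.slice(-window);
  return recent.every((lag, i) => i === 0 || lag > recent[i - 1]);
}
```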

Kafka is not a database, a message queue, or a pub/sub system — it's an event log. Design your architecture around this mental model, and the patterns (event sourcing, CQRS, stream processing) follow naturally.

Sarah Chen, Vaarak Infrastructure

Cloud Infrastructure Architect