Kafka Producer Consumer Flow sequence diagram

A Kafka producer-consumer flow describes the end-to-end journey of a message from the application that generates it to the application that processes it, with an Apache Kafka cluster acting as the durable, ordered buffer in between.

The flow begins when a producer publishes a record to a named topic. Before the write is acknowledged, Kafka routes the record to a specific partition within that topic — either by hashing the message key or using a round-robin strategy if no key is set. See Kafka Partitioning for how this assignment works. The partition leader (a broker) writes the record to its local log and then replicates it to follower brokers according to the topic's replication factor.
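The key-to-partition step can be sketched in a few lines of pure Python. This is a simplified stand-in, not Kafka's actual partitioner: the real Java client hashes keys with murmur2, and recent clients replace plain round-robin with a "sticky" per-batch strategy for keyless records.

```python
import zlib
from itertools import count

_round_robin = count()  # shared counter for keyless records

def assign_partition(key, num_partitions):
    """Simplified stand-in for Kafka's default partitioner.

    Keyed records hash deterministically, so the same key always lands on
    the same partition (preserving per-key ordering). crc32 is used here
    purely for illustration; Kafka's client uses murmur2.
    """
    if key is not None:
        return zlib.crc32(key.encode("utf-8")) % num_partitions
    # Keyless records: spread across partitions round-robin.
    return next(_round_robin) % num_partitions

# The same key always maps to the same partition.
assert assign_partition("user-123", 6) == assign_partition("user-123", 6)
```

Because partition assignment is deterministic for keyed records, all events for a given key (say, one user) stay in order within a single partition.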

Once the in-sync replicas (ISR) have persisted the record, Kafka sends an acknowledgment back to the producer. The producer's acks setting controls this: acks=0 means fire-and-forget, acks=1 means the leader alone confirms, and acks=all waits for all ISR members — the safest choice for durability.
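The three acknowledgment modes can be expressed as a toy decision function. This is an illustrative model of the broker's behavior, not broker code; the in-sync replica count here treats the leader as one ISR member.

```python
def should_ack(acks, leader_persisted, isr_acked, isr_size):
    """Toy model of when a produce request is acknowledged.

    acks="0":   the producer never waits, so the send is treated as done
                immediately (fire-and-forget, weakest durability).
    acks="1":   the leader's own write is enough; follower replication
                may still be in flight.
    acks="all": every in-sync replica must have persisted the record
                before the producer hears back (strongest durability).
    """
    if acks == "0":
        return True
    if acks == "1":
        return leader_persisted
    if acks == "all":
        return leader_persisted and isr_acked == isr_size
    raise ValueError(f"unknown acks setting: {acks}")

# With acks=all and one follower still catching up, no ack is sent yet.
assert should_ack("all", True, isr_acked=2, isr_size=3) is False
assert should_ack("1", True, isr_acked=0, isr_size=3) is True
```

The gap between `acks=1` and `acks=all` is exactly the window in which a leader failure can lose data: the leader confirmed the write, but no follower had copied it yet.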

On the read side, consumers in a Kafka Consumer Group poll their assigned partitions continuously. Each consumer tracks an offset — the position of the next record to read in each partition. After handling a batch of records, the consumer commits its offsets back to the __consumer_offsets topic. If the consumer crashes and restarts, it resumes from the last committed offset. Committing manually after processing (enable.auto.commit=false) therefore gives at-least-once delivery: any records that were processed but not yet committed are handled again on restart, so nothing is lost, but duplicates are possible whenever a crash or failed commit lands between processing and commit. Eliminating those duplicates is what motivates Exactly Once Delivery semantics.
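The process-then-commit cycle and its at-least-once consequence can be simulated without a Kafka client. The batch contents, offsets, and failure point below are made up for illustration:

```python
def consume(records, committed_offset, commit_succeeds):
    """Toy consumer poll loop: process a batch, then commit.

    If the commit fails (or the consumer crashes before committing),
    a restart resumes from the old committed offset and reprocesses
    the same records -- at-least-once delivery.
    """
    processed = []
    for offset, record in records:
        if offset < committed_offset:
            continue  # already covered by a previous committed run
        processed.append(record)
    if commit_succeeds:
        # Kafka commits the offset of the *next* record to read.
        committed_offset = records[-1][0] + 1
    return processed, committed_offset

batch = [(42, "a"), (43, "b"), (44, "c")]
# First run processes everything, but the commit fails.
seen1, off = consume(batch, committed_offset=42, commit_succeeds=False)
# After restart, the same records are handled again: duplicates, not loss.
seen2, off = consume(batch, committed_offset=off, commit_succeeds=True)
assert seen1 == seen2 == ["a", "b", "c"]
```

Note the convention in the diagram below the FAQ as well: after processing offsets 42–49, the consumer commits offset 50, the next position to read.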

Understanding this flow is foundational for diagnosing lag (when consumers fall behind producers), tuning throughput (batch sizes, linger.ms), and reasoning about failure modes. It also underpins higher-level patterns like Event Streaming Architecture and Stream Processing Pipeline.
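Consumer lag itself is simple arithmetic over the quantities introduced above: the distance between the newest offset in each partition's log and the consumer's committed position. A minimal sketch (the partition numbers and offsets are invented for illustration):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Lag per partition: how far the consumer trails the newest record.

    log_end_offsets:   partition -> offset of the next record to be written
    committed_offsets: partition -> next offset the consumer will read
    A partition missing from committed_offsets means nothing was
    committed yet, i.e. the consumer starts from offset 0.
    """
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}

lag = consumer_lag({0: 120, 1: 98}, {0: 100, 1: 98})
assert lag == {0: 20, 1: 0}  # partition 0 is 20 records behind
```

Monitoring tools report exactly this difference; a lag that grows without bound is the primary signal that consumers cannot keep up with producers.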


Frequently asked questions

What is the Kafka producer-consumer flow?

The Kafka producer-consumer flow is the end-to-end journey of a message from the application that generates it, through the Kafka broker cluster, to the application that processes it. The producer writes to a topic partition, the broker replicates and acknowledges the write, and the consumer polls and commits offsets to track progress through the partition log.
How does the flow work, step by step?

A producer publishes a record to a topic. Kafka routes it to a partition (by key hash or round-robin), the partition leader writes it to disk, and followers replicate it. Once the in-sync replicas confirm the write, the broker acknowledges the producer. Consumers in a consumer group poll their assigned partitions, process records, then commit offsets back to `__consumer_offsets` so restarts resume from the correct position.
When is Kafka the right choice?

Kafka excels when you need durable, replayable, high-throughput message streaming — typically millions of events per second. It is the right choice for event streaming architectures, stream processing pipelines, audit logging, and change-data-capture. For simpler task queue workloads with lower volume and no replay requirement, a traditional message broker like RabbitMQ or SQS may be simpler to operate.
What are common mistakes in a producer-consumer setup?

A common mistake is using `acks=1` (leader-only acknowledgment) when durability is required — a broker failure after the leader write but before follower replication can cause data loss. Another pitfall is `enable.auto.commit=true`, which commits offsets on a timer rather than after processing, risking lost messages if a crash occurs between auto-commit and processing. Teams also overlook consumer lag monitoring, which is the primary signal that consumers are falling behind producers.
How does Kafka differ from RabbitMQ?

Kafka is a distributed log optimised for high-throughput, durable, replayable event streaming. Messages are retained for a configurable period regardless of consumer state, enabling replay and multiple independent consumer groups. RabbitMQ is a message broker optimised for flexible routing — using exchanges, bindings, and routing keys — where messages are typically removed from the queue after acknowledgment. Kafka suits event streaming and audit; RabbitMQ suits task queues with complex routing logic and lower volume.
```mermaid
sequenceDiagram
    participant Producer as Producer App
    participant Broker as Kafka Broker (Leader)
    participant Replica as Kafka Broker (Follower)
    participant CG as Consumer Group
    participant Consumer as Consumer App
    Producer->>Broker: Send record (topic=orders, key=user-123)
    Note over Broker: Assign to partition by hash(key)
    Broker->>Broker: Write record to partition log (offset 42)
    Broker->>Replica: Replicate record to follower
    Replica-->>Broker: Ack: replication complete
    Broker-->>Producer: Ack: offset 42 committed (acks=all)
    Consumer->>CG: Join consumer group
    CG-->>Consumer: Assign partition 0
    Consumer->>Broker: Fetch records from offset 42
    Broker-->>Consumer: Return batch (offsets 42-49)
    Consumer->>Consumer: Process records
    Consumer->>Broker: Commit offset 50
    Broker-->>Consumer: Offset committed
```