diagram.mmd — flowchart
Kafka Partitioning flowchart diagram

Kafka partitioning is the mechanism by which a topic's message stream is divided into ordered, immutable sublogs that can be stored and processed in parallel across multiple brokers.

A Kafka topic is a logical channel, but data is never stored at the topic level — it lives in partitions. Each partition is an append-only, ordered log that resides on a single broker (the leader) and is replicated to one or more other brokers (the followers). Partitioning achieves two goals simultaneously: it allows topics to scale beyond a single machine's I/O capacity, and it provides the unit of parallelism for consumers.
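The "append-only, ordered log" model can be sketched in a few lines. This is purely illustrative — a real Kafka partition is a set of segment files on the leader broker's disk — but it shows the two properties the prose relies on: records get a monotonically increasing offset, and they are read by offset rather than overwritten.

```python
# Illustrative sketch of a partition as an append-only log.
# (A real Kafka partition is segment files on disk, not a Python list.)
class Partition:
    def __init__(self):
        self._log = []  # ordered; entries are never modified after append

    def append(self, record):
        """Append a record and return its offset (its position in the log)."""
        self._log.append(record)
        return len(self._log) - 1

    def read(self, offset):
        """Records are addressed by offset and never overwritten."""
        return self._log[offset]

p = Partition()
assert p.append("event-1") == 0  # first record gets offset 0
assert p.append("event-2") == 1  # offsets only ever increase
assert p.read(0) == "event-1"
```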

When a producer sends a record, the partitioner decides which partition receives it. The default strategy hashes the message key: records with the same key always land in the same partition, which is the primary tool for enforcing ordering by entity (e.g., all events for user-123 go to partition 2). Without a key, older clients used round-robin, while modern clients (Kafka 2.4+) use a sticky partitioner that fills a batch for one partition before moving to the next; either way, load is distributed evenly but per-entity order guarantees are abandoned. See Message Ordering Guarantee for how this plays out downstream.
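The key-hash rule can be sketched as below. Note the hedge: the real Java client hashes keys with murmur2, and `hashlib.md5` plus a simple rotating counter stand in here purely for illustration; `NUM_PARTITIONS` is an assumed topic setting.

```python
import hashlib

NUM_PARTITIONS = 3  # assumed topic configuration for this sketch

def partition_for(key, counter):
    """Sketch of producer-side partition selection.

    Real Kafka hashes keys with murmur2; md5 is a stand-in here.
    Keyless records rotate across partitions (round-robin style).
    """
    if key is not None:
        h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
        return h % NUM_PARTITIONS
    counter[0] += 1
    return counter[0] % NUM_PARTITIONS

c = [0]
# The same key always maps to the same partition:
assert partition_for(b"user-123", c) == partition_for(b"user-123", c)
```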

Replication works at the partition level. The leader handles all reads and writes; followers pull and replicate. If the leader broker fails, one of the in-sync replicas (ISR) is elected as the new leader — a process managed by ZooKeeper (legacy) or KRaft (modern). The min.insync.replicas setting, in combination with acks=all on the producer, controls how many in-sync replicas must confirm a write before the producer receives an ack, directly trading latency for durability.
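The durability check described above reduces to a size comparison on the ISR set. A minimal sketch, assuming min.insync.replicas=2 and an acks=all producer (broker names are invented for illustration):

```python
# Sketch of the acks=all durability rule: the leader accepts a write
# only while the in-sync replica set is at least min.insync.replicas.
MIN_INSYNC_REPLICAS = 2  # assumed topic/broker setting for this sketch

def can_ack(isr):
    """Return True if an acks=all write may be acknowledged."""
    return len(isr) >= MIN_INSYNC_REPLICAS

assert can_ack({"broker-1", "broker-2", "broker-3"})  # fully replicated
assert can_ack({"broker-1", "broker-2"})              # still enough replicas
assert not can_ack({"broker-1"})                      # under-replicated: writes rejected
```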

Understanding partitioning is prerequisite knowledge for tuning the Kafka Consumer Group assignment (you can't have more consumers than partitions) and for designing the Event Streaming Architecture that sits around it.
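The "no more consumers than partitions" rule falls out of the assignment step: each partition goes to exactly one consumer in the group, so surplus consumers sit idle. A round-robin assignment sketch (real Kafka ships several assignors — range, round-robin, sticky — this is the simplest):

```python
def assign(partitions, consumers):
    """Round-robin sketch: each partition is owned by exactly one
    consumer; consumers beyond the partition count receive nothing."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 3 partitions, 4 consumers: the fourth consumer is idle.
a = assign(["P0", "P1", "P2"], ["c1", "c2", "c3", "c4"])
assert a["c4"] == []
```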


Frequently asked questions

What is Kafka partitioning?
Kafka partitioning is the mechanism by which a topic's message stream is divided into ordered, immutable sublogs that can be stored and processed in parallel across multiple brokers. Each partition is an append-only log residing on one broker (the leader) and replicated to followers, giving Kafka both horizontal scalability and fault tolerance.
How are messages assigned to partitions?
When a producer sends a record, the partitioner assigns it to a specific partition. With a message key, records are assigned by hashing the key — ensuring all events for the same entity always land in the same partition. Without a key, records are spread across partitions (round-robin in older clients; sticky batching in newer ones). On the consumer side, each partition is assigned to exactly one consumer per group, guaranteeing in-order processing within each partition.
How many partitions should a topic have?
Partition count determines the maximum parallelism — you cannot have more active consumers in a group than partitions. Over-partitioning increases metadata overhead on the broker; under-partitioning creates throughput bottlenecks. A common starting point is to size partitions for your expected peak throughput per partition, with room to scale consumer count to match.
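That sizing heuristic is plain arithmetic. A back-of-envelope sketch with invented numbers (the throughput figures and headroom factor are assumptions, not recommendations):

```python
# Back-of-envelope partition sizing (all numbers are illustrative).
peak_topic_throughput_mb_s = 50      # assumed peak ingest for the topic
per_partition_throughput_mb_s = 10   # assumed sustainable rate per partition
headroom = 2                         # factor to leave room for growth

# Ceiling division, then apply headroom:
partitions = -(-peak_topic_throughput_mb_s // per_partition_throughput_mb_s) * headroom
assert partitions == 10  # 50/10 = 5 partitions minimum, doubled for headroom
```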
What are common Kafka partitioning mistakes?
A common mistake is choosing a partition key with low cardinality — for example, a boolean field — which causes all messages to pile into one or two partitions, creating hot spots. Another pitfall is not accounting for replication factor when sizing cluster storage. Teams also underestimate the operational cost of increasing partition count after a topic is in production, which triggers a partition reassignment that temporarily increases broker load.
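The low-cardinality hot spot is easy to demonstrate: a key with only two possible values can never reach more than two partitions, regardless of topic size. A small simulation (Python's built-in `hash` stands in for the client's murmur2 hash):

```python
from collections import Counter

NUM_PARTITIONS = 6  # assumed topic configuration for this sketch

def partition_for(key):
    # Stand-in for the client's murmur2 key hash.
    return hash(key) % NUM_PARTITIONS

# A boolean-valued key concentrates 1000 messages onto at most 2
# of the 6 partitions — the remaining partitions receive nothing.
hits = Counter(partition_for(k) for k in ["true", "false"] * 500)
assert len(hits) <= 2
assert sum(hits.values()) == 1000
```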
```mermaid
flowchart LR
    Producer -->|key=user-A| Partitioner
    Producer -->|key=user-B| Partitioner
    Producer -->|no key| Partitioner
    Partitioner -->|hash user-A → P0| P0[Partition 0\nLeader: Broker 1]
    Partitioner -->|hash user-B → P1| P1[Partition 1\nLeader: Broker 2]
    Partitioner -->|round-robin → P2| P2[Partition 2\nLeader: Broker 3]
    P0 -->|replicate| R0A[Partition 0 Replica\nBroker 2]
    P0 -->|replicate| R0B[Partition 0 Replica\nBroker 3]
    P1 -->|replicate| R1A[Partition 1 Replica\nBroker 1]
    P1 -->|replicate| R1B[Partition 1 Replica\nBroker 3]
    P2 -->|replicate| R2A[Partition 2 Replica\nBroker 1]
    P2 -->|replicate| R2B[Partition 2 Replica\nBroker 2]
```