Competing Consumers Pattern flowchart diagram

The competing consumers pattern is a message processing design in which multiple consumer instances read from a single queue, with each message delivered to exactly one consumer, enabling parallel processing and horizontal scaling of queue workloads.

When a single consumer cannot keep up with message throughput, the naive fix is to make it faster. The competing consumers pattern offers a better approach: run multiple identical consumers that all pull from the same queue. The queue guarantees that each message is delivered to exactly one consumer — whichever one requests it next. Consumers compete for messages, and the fastest available consumer wins the next item.
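The competition described above can be sketched with nothing but the standard library: one shared queue and several identical worker threads, where each `q.get()` call hands the next message to exactly one worker. This is a minimal sketch, not any particular broker's API.

```python
import queue
import threading

def worker(worker_id, q, results):
    """Pull messages until a None sentinel arrives; each q.get()
    delivers a message to exactly one competing worker."""
    while True:
        msg = q.get()
        try:
            if msg is None:          # shutdown sentinel
                return
            results.append((worker_id, msg))  # stand-in for real processing
        finally:
            q.task_done()

q = queue.Queue()
results = []
workers = [threading.Thread(target=worker, args=(i, q, results)) for i in range(3)]
for w in workers:
    w.start()

for n in range(12):
    q.put(f"msg-{n}")                # producers enqueue work
for _ in workers:
    q.put(None)                      # one sentinel per worker

q.join()
for w in workers:
    w.join()

print(len(results))                  # 12: each message consumed exactly once
```

Note that no coordination logic lives in the workers themselves: the queue's internal lock is what guarantees single delivery, which is the whole appeal of the pattern.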

This model is fundamentally different from Pub Sub Messaging, where each subscriber gets a copy of every message. Here, a message is claimed by one consumer and removed from the queue, so increasing the consumer count increases throughput proportionally (up to the rate of message production).

Load balancing is automatic: a slow consumer (perhaps processing a complex task) simply fetches fewer messages per unit time, while fast consumers handle more. If a consumer crashes mid-processing, the message is re-queued (after its visibility timeout expires) and another consumer picks it up. This is why Idempotent Consumer design matters — a message may be delivered to a replacement consumer after the original partially processed it.
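Because redelivery after a crash is possible, handlers must tolerate seeing the same message twice. A minimal idempotent-handler sketch (message shape and names are illustrative, and a production version would record the processed id atomically with the side effect):

```python
processed_ids = set()    # in production: a database table or cache keyed by message id
account_balance = 0

def handle(message):
    """Apply a non-idempotent side effect at most once per message id."""
    global account_balance
    if message["id"] in processed_ids:    # duplicate delivery: skip
        return
    account_balance += message["amount"]  # the side effect
    processed_ids.add(message["id"])      # mark as processed

handle({"id": "msg-42", "amount": 100})
handle({"id": "msg-42", "amount": 100})   # redelivery after a simulated crash
print(account_balance)                    # 100, not 200
```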

Scaling is elastic: add consumers during peak hours, remove them during off-peak. In Kubernetes, this maps directly to HPA-driven replica scaling based on queue depth metrics. The pattern is the backbone of virtually every background job system, explored further in Kafka Consumer Group for the Kafka-specific implementation.
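The scaling arithmetic an HPA applies to an external queue-depth metric reduces to a ceiling division clamped to configured bounds. A hedged sketch, with illustrative parameter names and thresholds:

```python
import math

def desired_replicas(queue_depth, target_per_replica, min_replicas=1, max_replicas=10):
    """Replicas needed so each handles roughly target_per_replica
    pending messages, clamped to the configured min/max."""
    raw = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, raw))

print(desired_replicas(1200, 100))  # 10: deep backlog, clamped at max
print(desired_replicas(40, 100))    # 1: shallow backlog, floor at min
```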


Frequently asked questions

What is the competing consumers pattern?
The competing consumers pattern runs multiple identical consumer instances against a single queue, with the queue guaranteeing each message is delivered to exactly one consumer. Consumers "compete" for the next available message, enabling parallel processing without coordination logic in application code.

How does load balancing work between consumers?
Each consumer polls the queue for the next available message. The queue's lock mechanism ensures only one consumer claims a given message. A slow consumer simply claims fewer messages per second while faster consumers pick up the slack, providing automatic load balancing without any central scheduler.

When should you use the competing consumers pattern?
Use it when a single consumer instance cannot keep up with message throughput and you need to scale processing horizontally. It is the standard approach for background job systems, image processing queues, order fulfilment pipelines, and any workload where tasks are independent and order does not matter globally.

What are the common mistakes with this pattern?
The most common mistake is neglecting idempotency: if a consumer crashes mid-processing, the message re-queues and another consumer picks it up, meaning the same message may be partially processed twice. Consumers must be designed as idempotent handlers. A second mistake is adding more consumers than partitions in Kafka: excess consumers sit idle since each partition is assigned to only one consumer.
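The partition-count ceiling can be made concrete with a toy assignment function. This is an illustrative round-robin sketch, not Kafka's actual assignor: each partition goes to exactly one consumer, so any consumer beyond the partition count receives nothing.

```python
def assign(partitions, consumers):
    """Round-robin partitions over consumers; each partition maps to one consumer."""
    mapping = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        mapping[consumers[i % len(consumers)]].append(p)
    return mapping

print(assign([0, 1, 2], ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
# c4 gets no partitions: a 4th consumer on a 3-partition topic adds no throughput
```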
mermaid
flowchart LR
    subgraph Producers
        P1[Producer A]
        P2[Producer B]
        P3[Producer C]
    end
    Q[Shared Work Queue\n12 pending messages]
    subgraph Consumers[Consumer Pool - 3 instances]
        C1[Consumer 1\nprocessing msg-3]
        C2[Consumer 2\nprocessing msg-7]
        C3[Consumer 3\nidle - waiting]
    end
    P1 -->|enqueue| Q
    P2 -->|enqueue| Q
    P3 -->|enqueue| Q
    Q -->|msg-3 locked| C1
    Q -->|msg-7 locked| C2
    Q -->|next available msg| C3
    C1 -->|ack on success| Q
    C2 -->|ack on success| Q
    C1 -->|fail - visibility timeout| Q
    Q -->|re-enqueue after timeout| Q