diagram.mmd — sequence
Leader Election sequence diagram

Leader election is the process by which nodes in a distributed cluster collectively agree on a single node to act as the coordinator — the leader — responsible for making decisions, distributing work, and maintaining cluster-wide state.

What the diagram shows

This sequence diagram models leader election using a consensus-based approach (similar to etcd/Raft or ZooKeeper). Three nodes — Node 1, Node 2, and Node 3 — and a Coordination Service participate. On startup or after detecting that the previous leader has failed, each node sends a PUT /election/{nodeId} request with a TTL to the Coordination Service, attempting to write to the same key. The Coordination Service applies first-writer-wins semantics: Node 1 wins the election and receives a confirmation. Node 2 and Node 3 receive "election lost" responses.
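The first-writer-wins step above can be sketched with a toy in-memory stand-in for the coordination service (this is not the etcd or ZooKeeper API; the class and method names are illustrative):

```python
import time

class CoordinationService:
    """Toy in-memory coordination service: a single election key,
    first-writer-wins semantics, and TTL-based expiry."""

    def __init__(self):
        self._key = None  # (owner_node_id, expires_at) or None

    def try_elect(self, node_id, ttl):
        """Attempt to write the election key. Returns True if this
        node wins (or already holds) the election, False otherwise."""
        now = time.monotonic()
        # A stale key is expired before the write is judged.
        if self._key is not None and self._key[1] <= now:
            self._key = None
        if self._key is None:
            self._key = (node_id, now + ttl)
            return True  # first writer: election won
        return self._key[0] == node_id  # losers see the existing key

# Three nodes race to write the same key; the first writer wins.
cs = CoordinationService()
results = {n: cs.try_elect(n, ttl=10.0) for n in ["node1", "node2", "node3"]}
print(results)  # node1 wins; node2 and node3 lose
```

A real coordination service makes this write atomic across the cluster (e.g. a compare-and-set on key version 0), which is what lets exactly one of the three concurrent PUTs succeed.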

Node 1, now the Leader, begins sending periodic heartbeat renewals to keep the election key from expiring. Node 2 and Node 3 enter Follower state, subscribing to the leader key via a watch. When the leader's heartbeat stops (crash or network partition), the TTL expires, the key is deleted, and the Coordination Service broadcasts a LeadershipLost event. The remaining nodes immediately start a new election round.
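The heartbeat-and-expiry dance can be traced with a small simulation using a fake clock (the `Lease` class and the 5s/10s figures follow the diagram; everything else is illustrative):

```python
class Lease:
    """A TTL lease: each successful renewal pushes expiry out by TTL."""

    def __init__(self, owner, ttl, now):
        self.owner = owner
        self.ttl = ttl
        self.expires_at = now + ttl

    def renew(self, now):
        self.expires_at = now + self.ttl

    def expired(self, now):
        return now >= self.expires_at

# Simulated clock so the example runs instantly.
clock = 0.0
lease = Lease("node1", ttl=10.0, now=clock)

# Leader heartbeats every 5s: renewals keep the lease alive.
for _ in range(3):
    clock += 5.0
    assert not lease.expired(clock)
    lease.renew(clock)

# Leader crashes: no renewal for a full TTL, so the lease expires.
# The coordination service would now delete the key and broadcast
# LeadershipLost, triggering a new election round.
clock += 10.0
print(lease.expired(clock))  # True
```

The renewal interval (5s) being well under the TTL (10s) gives the leader headroom for one missed or delayed heartbeat before followers conclude it has failed.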

Why this matters

Leader election is foundational to any distributed system that requires a single authoritative coordinator — distributed job schedulers, database primary selection, and shard assignment all depend on it. The two failure modes to guard against are split-brain (two nodes both believe they are leader) and the election storm (nodes spin in tight election loops during instability). Using a TTL-based lease with a coordination service provides strong consistency guarantees that prevent split-brain. For the locking primitive that uses a similar mechanism, see Distributed Locking.
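The split-brain guard cuts both ways: the coordination service expires the key, but the deposed leader must also demote itself locally. A minimal sketch of that leader-side discipline, assuming the node tracks its last successful renewal (class and method names are hypothetical):

```python
class LeaderLoop:
    """Sketch of the leader-side check that prevents split-brain:
    a node treats itself as leader only while its last successful
    lease renewal is within the TTL on its own clock."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.last_renewal = None

    def on_renewed(self, now):
        self.last_renewal = now

    def is_leader(self, now):
        # If renewals stop (crash or partition), the node steps down
        # on its own before the coordination service's expired key
        # lets a rival win the next election round.
        return (self.last_renewal is not None
                and now - self.last_renewal < self.ttl)

loop = LeaderLoop(ttl=10.0)
loop.on_renewed(now=0.0)
print(loop.is_leader(now=5.0))   # True: lease still fresh
print(loop.is_leader(now=12.0))  # False: partitioned node steps down
```

Pairing this local step-down with the service-side TTL is what ensures a partitioned old leader and a newly elected one never act as coordinator at the same time.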


Frequently asked questions

What is leader election?
Leader election is the process by which nodes in a distributed cluster collectively agree on a single node to act as the authoritative coordinator — responsible for making decisions, distributing work, and maintaining cluster-wide state — without manual intervention.

How does leader election work?
Each node attempts to acquire a leadership lease by writing to the same key in a coordination service (etcd, ZooKeeper) with a TTL. The first writer wins and becomes the leader, while other nodes enter follower state and watch the key for expiry. If the leader crashes, the TTL expires, the key is deleted, and all followers immediately start a new election.

When should you use leader election?
Use leader election whenever you need exactly one node in a cluster to own a responsibility at a time — scheduling cron jobs, assigning database shard primaries, coordinating distributed writes, or managing singleton background processes across a multi-instance deployment.
```mermaid
sequenceDiagram
    participant N1 as Node 1
    participant N2 as Node 2
    participant N3 as Node 3
    participant CS as Coordination Service
    Note over N1,CS: Election round starts (startup or leader failure)
    N1->>CS: PUT /election/node1 TTL=10s
    N2->>CS: PUT /election/node2 TTL=10s
    N3->>CS: PUT /election/node3 TTL=10s
    CS-->>N1: Election won - you are Leader
    CS-->>N2: Election lost - follow Node 1
    CS-->>N3: Election lost - follow Node 1
    Note over N1: Becomes Leader
    Note over N2,N3: Enter Follower state
    loop Every 5s (heartbeat)
        N1->>CS: RENEW /election/node1 TTL=10s
        CS-->>N1: Lease renewed
    end
    Note over N1: Node 1 crashes - heartbeat stops
    CS->>N2: LeadershipLost event (TTL expired)
    CS->>N3: LeadershipLost event (TTL expired)
    Note over N2,N3: New election round begins
    N2->>CS: PUT /election/node2 TTL=10s
    N3->>CS: PUT /election/node3 TTL=10s
    CS-->>N2: Election won - you are new Leader
    CS-->>N3: Election lost - follow Node 2
```