diagram.mmd — sequence
Leader Election sequence diagram

Leader election is the process by which nodes in a distributed cluster collectively agree on a single node to act as the coordinator — the leader — responsible for making decisions, distributing work, and maintaining cluster-wide state.

What the diagram shows

This sequence diagram models leader election using a consensus-based approach (similar to etcd/Raft or ZooKeeper). Three nodes — Node 1, Node 2, and Node 3 — and a Coordination Service participate. On startup or after detecting that the previous leader has failed, each node sends a PUT /election/{nodeId} request with a TTL to the Coordination Service, attempting to write to the same key. The Coordination Service applies first-writer-wins semantics: Node 1 wins the election and receives a confirmation. Node 2 and Node 3 receive "election lost" responses.
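The first-writer-wins step above can be sketched with a toy in-memory stand-in for the coordination service (this is not the etcd or ZooKeeper API; the class and method names are illustrative):

```python
import time

class CoordinationService:
    """Toy in-memory coordination service: a single election key,
    first-writer-wins semantics, and TTL-based expiry."""

    def __init__(self):
        self._key = None  # (owner_node_id, expires_at) or None

    def try_elect(self, node_id, ttl):
        """Attempt to write the election key. Returns True if this
        node wins (or already holds) the election, False otherwise."""
        now = time.monotonic()
        # A stale key is expired before the write is judged.
        if self._key is not None and self._key[1] <= now:
            self._key = None
        if self._key is None:
            self._key = (node_id, now + ttl)
            return True  # first writer: election won
        return self._key[0] == node_id  # losers see the existing key

# Three nodes race to write the same key; the first writer wins.
cs = CoordinationService()
results = {n: cs.try_elect(n, ttl=10.0) for n in ["node1", "node2", "node3"]}
print(results)  # node1 wins; node2 and node3 lose
```

A real coordination service makes this write atomic across the cluster (e.g. a compare-and-set on key version 0), which is what lets exactly one of the three concurrent PUTs succeed.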

Node 1, now the Leader, begins sending periodic heartbeat renewals to keep the election key from expiring. Node 2 and Node 3 enter Follower state, subscribing to the leader key via a watch. When the leader's heartbeat stops (crash or network partition), the TTL expires, the key is deleted, and the Coordination Service broadcasts a LeadershipLost event. The remaining nodes immediately start a new election round.
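The heartbeat-and-expiry dance can be traced with a small simulation using a fake clock (the `Lease` class and the 5s/10s figures follow the diagram; everything else is illustrative):

```python
class Lease:
    """A TTL lease: each successful renewal pushes expiry out by TTL."""

    def __init__(self, owner, ttl, now):
        self.owner = owner
        self.ttl = ttl
        self.expires_at = now + ttl

    def renew(self, now):
        self.expires_at = now + self.ttl

    def expired(self, now):
        return now >= self.expires_at

# Simulated clock so the example runs instantly.
clock = 0.0
lease = Lease("node1", ttl=10.0, now=clock)

# Leader heartbeats every 5s: renewals keep the lease alive.
for _ in range(3):
    clock += 5.0
    assert not lease.expired(clock)
    lease.renew(clock)

# Leader crashes: no renewal for a full TTL, so the lease expires.
# The coordination service would now delete the key and broadcast
# LeadershipLost, triggering a new election round.
clock += 10.0
print(lease.expired(clock))  # True
```

The renewal interval (5s) being well under the TTL (10s) gives the leader headroom for one missed or delayed heartbeat before followers conclude it has failed.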

Why this matters

Leader election is foundational to any distributed system that requires a single authoritative coordinator — distributed job schedulers, database primary selection, and shard assignment all depend on it. The two failure modes to guard against are split-brain (two nodes both believe they are leader) and the election storm (nodes spin in tight election loops during instability). Using a TTL-based lease with a coordination service provides strong consistency guarantees that prevent split-brain. For the locking primitive that uses a similar mechanism, see Distributed Locking.
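The split-brain guard cuts both ways: the coordination service expires the key, but the deposed leader must also demote itself locally. A minimal sketch of that leader-side discipline, assuming the node tracks its last successful renewal (class and method names are hypothetical):

```python
class LeaderLoop:
    """Sketch of the leader-side check that prevents split-brain:
    a node treats itself as leader only while its last successful
    lease renewal is within the TTL on its own clock."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.last_renewal = None

    def on_renewed(self, now):
        self.last_renewal = now

    def is_leader(self, now):
        # If renewals stop (crash or partition), the node steps down
        # on its own before the coordination service's expired key
        # lets a rival win the next election round.
        return (self.last_renewal is not None
                and now - self.last_renewal < self.ttl)

loop = LeaderLoop(ttl=10.0)
loop.on_renewed(now=0.0)
print(loop.is_leader(now=5.0))   # True: lease still fresh
print(loop.is_leader(now=12.0))  # False: partitioned node steps down
```

Pairing this local step-down with the service-side TTL is what ensures a partitioned old leader and a newly elected one never act as coordinator at the same time.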


Frequently asked questions

What is leader election?
Leader election is the process by which nodes in a distributed cluster collectively agree on a single node to act as the authoritative coordinator — responsible for making decisions, distributing work, and maintaining cluster-wide state — without manual intervention.

How does leader election work?
Each node attempts to acquire a leadership lease by writing to the same key in a coordination service (etcd, ZooKeeper) with a TTL. The first writer wins and becomes the leader, while other nodes enter follower state and watch the key for expiry. If the leader crashes, the TTL expires, the key is deleted, and all followers immediately start a new election.

When should you use leader election?
Use leader election whenever you need exactly one node in a cluster to own a responsibility at a time — scheduling cron jobs, assigning database shard primaries, coordinating distributed writes, or managing singleton background processes across a multi-instance deployment.
```mermaid
sequenceDiagram
    participant N1 as Node 1
    participant N2 as Node 2
    participant N3 as Node 3
    participant CS as Coordination Service
    Note over N1,CS: Election round starts (startup or leader failure)
    N1->>CS: PUT /election/node1 TTL=10s
    N2->>CS: PUT /election/node2 TTL=10s
    N3->>CS: PUT /election/node3 TTL=10s
    CS-->>N1: Election won - you are Leader
    CS-->>N2: Election lost - follow Node 1
    CS-->>N3: Election lost - follow Node 1
    Note over N1: Becomes Leader
    Note over N2,N3: Enter Follower state
    loop Every 5s (heartbeat)
        N1->>CS: RENEW /election/node1 TTL=10s
        CS-->>N1: Lease renewed
    end
    Note over N1: Node 1 crashes - heartbeat stops
    CS->>N2: LeadershipLost event (TTL expired)
    CS->>N3: LeadershipLost event (TTL expired)
    Note over N2,N3: New election round begins
    N2->>CS: PUT /election/node2 TTL=10s
    N3->>CS: PUT /election/node3 TTL=10s
    CS-->>N2: Election won - you are new Leader
    CS-->>N3: Election lost - follow Node 2
```