Cluster Coordination Architecture flowchart diagram

Cluster coordination architecture describes the set of services and protocols that allow multiple nodes in a distributed system to agree on shared state — including cluster membership, configuration, distributed locks, and leader identity — providing a consistent foundation for higher-level applications.

Coordination is hard because it requires strong consistency in the face of network partitions and node failures. Systems that need cluster coordination typically rely on a dedicated coordination service (ZooKeeper, etcd, Consul) backed by a consensus algorithm like Raft to guarantee linearizability.

Coordination Service: The coordination service (e.g., etcd) runs as a small cluster of 3 or 5 nodes. Its own internal consensus guarantees that writes are durable and reads are linearizable. Application services talk to this cluster through a client library, watching keys for changes with low-latency push notifications.
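The watch mechanism above can be illustrated with a toy in-memory stand-in. This is a sketch of the push-notification pattern only; the names (KVStore, watch, put) are illustrative and not the API of etcd or any real client library, which deliver these events over a gRPC stream instead.

```python
# Toy in-memory stand-in for a coordination service's watch mechanism.
# A real service (etcd, ZooKeeper) replicates writes via consensus and
# pushes change events to watchers over a network stream.
from collections import defaultdict

class KVStore:
    def __init__(self):
        self.data = {}
        self.watchers = defaultdict(list)  # key -> list of callbacks

    def watch(self, key, callback):
        """Register a callback invoked on every change to `key`."""
        self.watchers[key].append(callback)

    def put(self, key, value):
        """Write a value, then push a change event to all watchers."""
        self.data[key] = value
        for cb in self.watchers[key]:
            cb(key, value)

store = KVStore()
events = []
store.watch("config/rate-limit", lambda k, v: events.append((k, v)))
store.put("config/rate-limit", "100")   # watcher is notified on write
```

The key property this models is that clients do not poll: a write to a watched key triggers a notification, which is how routing tables and configuration stay current with low latency.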

Distributed Locking: Services that need mutual exclusion (e.g., only one instance should run a migration at a time) acquire a lease from the coordination service. The lease is associated with a time-to-live (TTL); if the holder crashes, the lock expires automatically. A fencing token (monotonically increasing version number) prevents a slow lock holder from corrupting data after the lease expires. See Distributed Locking for the full protocol.
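The fencing-token check can be sketched as follows. This is a minimal illustration of the idea, assuming the storage layer records the highest token it has seen; the class and method names are hypothetical, not a specific library's API.

```python
# Sketch of fencing-token enforcement at the storage layer. The token is
# a monotonically increasing number issued with each lock grant; storage
# rejects any write carrying a token older than the highest one seen.

class FencedStorage:
    def __init__(self):
        self.highest_token = -1
        self.value = None

    def write(self, token, value):
        if token < self.highest_token:
            # A newer lock holder has already written, so this writer's
            # lease must have expired while it was paused. Reject.
            raise RuntimeError(f"stale fencing token {token}")
        self.highest_token = token
        self.value = value

storage = FencedStorage()
storage.write(token=33, value="migration step 1")   # current lock holder
storage.write(token=34, value="migration step 2")   # next holder after TTL expiry
try:
    storage.write(token=33, value="late write")     # old holder wakes up
except RuntimeError:
    pass  # stale write rejected; shared state stays consistent
```

Without the token check, the paused holder's late write would silently overwrite the new holder's data, which is exactly the corruption the fencing token prevents.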

Service Discovery and Configuration: Workers register their address and health status in the coordination service. Clients watch these registrations and update their routing tables when workers join or leave. Global configuration values (feature flags, rate limits) are also stored here, enabling zero-downtime configuration changes that are pushed to all watching instances with low latency.
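The TTL-based registration pattern can be sketched with a toy registry. In a real coordination service the server expires the key itself and notifies watchers; this stdlib-only sketch checks expiry lazily, and all names here are illustrative.

```python
import time

# Toy registry illustrating TTL-based worker registration. A worker
# registers with a TTL and must refresh before it expires; an entry
# whose TTL lapses is treated as a dead worker.
class Registry:
    def __init__(self):
        self.entries = {}  # address -> expiry timestamp

    def register(self, address, ttl):
        self.entries[address] = time.monotonic() + ttl

    def refresh(self, address, ttl):
        # Same as register: pushes the expiry forward (a heartbeat).
        self.entries[address] = time.monotonic() + ttl

    def live_workers(self):
        now = time.monotonic()
        return [addr for addr, expiry in self.entries.items() if expiry > now]

reg = Registry()
reg.register("10.0.0.5:8080", ttl=30)
reg.register("10.0.0.6:8080", ttl=-1)   # already expired: simulates a crash
```

Here `reg.live_workers()` returns only the healthy worker, which is the view a client's routing table would converge to after the dead worker's key expires.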

Leader Election in Applications: Application-level leader election (distinct from the coordination service's internal election) is implemented by multiple instances competing to create the same key with a TTL. The instance that creates the key first wins; others watch the key and take over if it disappears. Compare with Leader Election Algorithm for the lower-level Raft-based mechanism.
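The compete-to-create election described above can be sketched against a toy store. The atomic create-if-absent operation mirrors what real systems provide (for example, a conditional transaction on key version in etcd), but the class and method names here are illustrative, not a real client API.

```python
# Sketch of compete-to-create leader election. Each instance tries to
# atomically create the same key; the first to succeed is the leader,
# and the others watch the key and retry when it disappears.

class CoordStore:
    def __init__(self):
        self.keys = {}

    def create(self, key, value):
        """Atomically create `key`; return False if it already exists."""
        if key in self.keys:
            return False
        self.keys[key] = value
        return True

    def delete(self, key):
        """Simulates TTL expiry after the holder stops refreshing."""
        self.keys.pop(key, None)

store = CoordStore()
won_a = store.create("election/leader", "instance-a")   # first creator wins
won_b = store.create("election/leader", "instance-b")   # loses, watches key

# Leader crashes and stops refreshing its TTL: the key expires,
# watchers are notified, and a standby retries the create.
store.delete("election/leader")
won_b_retry = store.create("election/leader", "instance-b")
```

The TTL attached to the key is what makes this safe: a crashed leader cannot hold the role forever, and failover time is bounded by the TTL.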


Frequently asked questions

What is cluster coordination architecture?
Cluster coordination architecture is the set of services and protocols that allow distributed nodes to agree on shared state — including cluster membership, distributed locks, configuration values, and leader identity. It provides a consistent, fault-tolerant foundation that higher-level application services rely on to make safe decisions without stepping on each other.
How does a coordination service work?
A coordination service such as etcd, ZooKeeper, or Consul runs as a small cluster (typically 3 or 5 nodes) backed by a consensus algorithm like Raft. Clients write and watch keys through a client library. The consensus layer guarantees linearizable reads and durable writes, while push notifications (watches) allow clients to react to membership and configuration changes with low latency.
When should you use a dedicated coordination service?
Use a dedicated coordination service whenever multiple services or instances need to agree on something — who holds a distributed lock, which instance is the leader, or what the current configuration value is. Trying to implement these primitives directly in application code without a consensus-backed service leads to subtle race conditions and split-brain scenarios under partition.
What are the most common mistakes in cluster coordination?
The most dangerous mistake is ignoring fencing tokens: a slow lock holder can continue acting after its lease expires and corrupt shared state. Other common issues include running the coordination service with an even number of nodes (reducing fault tolerance), watching too many keys per client (causing thundering-herd reconnects), and relying on TTL-based locks without handling clock skew.
How does a coordination service differ from two-phase commit?
Cluster coordination services (etcd, ZooKeeper) provide general-purpose primitives — key-value storage, watches, and leases — backed by a consensus algorithm for durability. Two-phase commit (2PC) is a specific protocol for atomically committing or aborting a transaction across multiple independent resource managers. Coordination services are used to implement distributed locks and leader election; 2PC is used to achieve atomic cross-service writes in distributed transactions.
mermaid
flowchart TD
    subgraph CoordCluster ["Coordination Cluster — etcd — 3 nodes — Raft consensus"]
        E1[etcd Leader]
        E2[etcd Follower 1]
        E3[etcd Follower 2]
        E1 <-->|Raft replication| E2
        E1 <-->|Raft replication| E3
    end

    subgraph AppLayer ["Application Services"]
        A1[Service A — Instance 1]
        A2[Service A — Instance 2]
        A3[Service B — Worker]
        A4[Service B — Worker]
    end

    subgraph UseCases ["Coordination Use Cases"]
        UC1["Distributed Lock\n- acquire lease with TTL\n- fencing token prevents stale writes"]
        UC2["Service Discovery\n- register address + health\n- watch prefix for changes"]
        UC3["Leader Election\n- compete to create key\n- winner is app leader"]
        UC4["Config Management\n- store feature flags\n- push updates to all watchers"]
    end

    A1 -->|gRPC watch| E1
    A2 -->|gRPC watch| E1
    A3 -->|gRPC watch| E2
    A4 -->|gRPC watch| E3

    E1 --> UC1
    E1 --> UC2
    E1 --> UC3
    E1 --> UC4

    subgraph HealthCheck ["Health and Membership"]
        H1[Worker registers health key\nwith TTL=30s] --> H2[Worker refreshes TTL\nevery 10s]
        H2 --> H3{TTL expires\nno refresh?}
        H3 -->|Yes — worker dead| H4[Key deleted — watchers\nnotified of departure]
        H3 -->|No| H2
    end

    style CoordCluster fill:#eff6ff,stroke:#3b82f6
    style UseCases fill:#f0fdf4,stroke:#22c55e
    style HealthCheck fill:#fef3c7,stroke:#f59e0b