MVCC Workflow: Mermaid Sequence Diagram

About Source

Multi-Version Concurrency Control (MVCC) is a concurrency strategy used by databases like PostgreSQL, MySQL InnoDB, and Oracle to allow readers and writers to operate on the same data simultaneously without blocking each other.

This sequence diagram shows two concurrent transactions — a reader (Transaction A) and a writer (Transaction B) — operating on the same row. When Transaction B writes an updated value to a row, the database does not overwrite the existing version. Instead, it creates a new row version tagged with B's transaction ID and a xmin timestamp. The old version remains visible in the heap, tagged with its original xmin and with xmax set to B's transaction ID to indicate it is being superseded.

Transaction A, which started before B's write, holds a snapshot taken at its BEGIN time. When A reads the row, the database evaluates visibility rules: it checks whether the row version's xmin is committed and visible to A's snapshot, and whether the version's xmax is either absent or refers to a transaction that was not yet committed when A's snapshot was taken. The old version satisfies these rules — so A sees the pre-update value. The new version created by B is not visible to A because B's transaction ID is newer than A's snapshot horizon.

This is what makes MVCC powerful: readers never block writers and writers never block readers. The cost is storage overhead — every update creates a new version, and old versions accumulate until the database's garbage collector (VACUUM in PostgreSQL) identifies and reclaims versions no longer needed by any active transaction.

MVCC underpins the Isolation levels described in ACID Properties. Read Committed mode takes a new snapshot per statement; Repeatable Read takes a snapshot once at BEGIN and holds it for the transaction's duration, as shown in this diagram.

Frequently asked questions

MVCC is a concurrency strategy used by databases like PostgreSQL, MySQL InnoDB, and Oracle to allow readers and writers to operate on the same data simultaneously without blocking each other. Instead of locking rows for reads, the database maintains multiple versions of each row and serves each transaction the version that was current when its snapshot was taken.

When a transaction writes a row, the database creates a new version tagged with the writing transaction's ID (xmin), leaving the old version intact with an xmax marker pointing to the new transaction. When a reading transaction accesses the row, the database evaluates visibility rules against the reader's snapshot: it serves the newest version whose xmin is committed and visible to the snapshot, and whose xmax either does not exist or belongs to a transaction that was not yet committed when the snapshot was taken.

In locking-based concurrency control, a read acquires a shared lock that blocks concurrent writers, and a write acquires an exclusive lock that blocks concurrent readers. Under MVCC, readers never block writers and writers never block readers — each transaction sees a consistent snapshot of the database as of its start time. The trade-off is storage overhead: MVCC accumulates old row versions that must be periodically reclaimed by a garbage collector (VACUUM in PostgreSQL).

Table bloat is the most common issue in PostgreSQL: if VACUUM cannot run frequently enough (due to long-running transactions holding back the oldest snapshot horizon), dead row versions accumulate and inflate table size. Another issue is transaction ID wraparound in PostgreSQL, which requires proactive monitoring and aggressive vacuuming. In MySQL InnoDB, the undo log grows large during long transactions and can cause performance degradation.

mermaid

sequenceDiagram
    participant TxA as Transaction A (Reader)
    participant Engine as DB Engine
    participant Heap as Table Heap
    participant TxB as Transaction B (Writer)

    TxA->&gt;Engine: BEGIN (snapshot at txid=100)
    TxB->&gt;Engine: BEGIN (txid=101)

    TxA->&gt;Engine: SELECT balance FROM accounts WHERE id=1
    Engine->&gt;Heap: Find row versions for id=1
    Heap-->&gt;Engine: Version v1 (xmin=50, xmax=null, balance=1000)
    Engine->&gt;Engine: Visibility check: xmin=50 committed, xmax null — visible to txid=100
    Engine-->&gt;TxA: balance = 1000

    TxB->&gt;Engine: UPDATE accounts SET balance=750 WHERE id=1
    Engine->&gt;Heap: Create new version v2 (xmin=101, xmax=null, balance=750)
    Engine->&gt;Heap: Mark old version v1 xmax=101
    Engine-->&gt;TxB: 1 row updated

    TxB->&gt;Engine: COMMIT
    Engine->&gt;Engine: Mark txid=101 committed in transaction log

    TxA->&gt;Engine: SELECT balance FROM accounts WHERE id=1
    Engine->&gt;Heap: Find row versions for id=1
    Heap-->&gt;Engine: v1 (xmin=50, xmax=101) and v2 (xmin=101, xmax=null)
    Engine->&gt;Engine: v1 xmax=101 committed after snapshot — still visible to A
    Engine->&gt;Engine: v2 xmin=101 newer than snapshot — not visible to A
    Engine-->&gt;TxA: balance = 1000 (repeatable read)

    TxA->&gt;Engine: COMMIT