Brabeem Sapkota
  • Sep 8, 2025
  • 5 min read

From Simple Queries to Concurrency Nightmares


As a developer, I used to think of a database as a simple box. You put data in, you get data out. I’d write my code, run my UPDATE statement, and trust that the “magic” inside the box would just handle it.

But then I started asking a dangerous question: What happens if two people try to do the same thing at the exact same time?

This single question took me on a journey deep into the hidden world of database concurrency, and what I found was both terrifying and brilliant.

The First Illusion: “My Statements Are Safe, Right?”

Let’s start with a simple bank balance. The account has $1000.

Two requests hit our server at the same time:

  • Request A wants to add $100.
  • Request B wants to add $200.

My code for this was a single, simple statement: UPDATE accounts SET balance = balance + 100 WHERE id = 123.

I wondered, “Can this go wrong?” If the database were naive, two threads could both read the initial balance of $1000, one would calculate $1100, the other $1200, and the last one to write would win. The final balance would be $1200 instead of the correct $1300. Someone’s money would vanish.

But, as I learned, this is the very first problem a database is designed to solve. Every single statement implicitly runs in its own transaction, and the database takes a short-lived row-level lock while the statement executes, so concurrent updates to the same row are applied one after the other. Phew! My simple updates were safe.

The Trap: The Read-Modify-Write Cycle

Feeling confident, I moved on to more complex logic. What if I needed to read the balance first, do some checks in my application code, and then write it back?

My code looked something like this:

// 1. Read
int currentBalance = database.readBalance(123);

// 2. Modify (in my app)
if (currentBalance < 5000) {
    currentBalance += 100;
}

// 3. Write
database.writeBalance(123, currentBalance);

And this is where the nightmare began. I realized that the database’s protection on individual statements didn’t help me here at all. Two concurrent requests could both:

  1. Read the balance of $1000.
  2. Modify it in their own application memory.
  3. Write it back.

The last one to write would overwrite the other’s work. The exact “Lost Update” problem I thought I was safe from was back, but this time, the database couldn’t automatically protect me.

The “Solution” That Wasn’t: A Simple Transaction

“Ah!” I thought. “This is what transactions are for!” I knew transactions were “all or nothing,” so I wrapped my logic:

BEGIN TRANSACTION;
SELECT balance FROM accounts WHERE id = 123;
-- ... my app logic happens here ...
UPDATE accounts SET balance = 1100 WHERE id = 123;
COMMIT;

I ran the scenario again. Transaction A starts, reads $1000. Then, a separate statement from Transaction B comes in and updates the balance to $1500. Then my Transaction A, completely unaware, finishes its logic based on the stale value of $1000 and writes back $1100. The update to $1500 was lost.

My transaction block didn’t help! This was a shocking revelation. It turns out a transaction is just a container; what matters is the level of isolation you have inside it.

Unlocking the Levels: How Isolation Really Works

By default, PostgreSQL (and many other databases) runs at an isolation level called READ COMMITTED. It only guarantees that you won’t read the messy, half-finished work of another transaction. It makes no promise that the data won’t change the microsecond after you’ve read it.

To solve my problem, I needed a stricter level: REPEATABLE READ.

This is where I discovered the magic of MVCC (Multi-Version Concurrency Control). The database doesn’t just overwrite data; it keeps old versions of rows around. Each row has hidden xmin and xmax transaction IDs, marking which transaction created it and which “deleted” it.

When my REPEATABLE READ transaction begins, it takes a “snapshot” by noting the current transaction ID. For its entire life, it can only see row versions that were committed before its snapshot was taken.
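The visibility rule is easy to sketch. This toy Python model is drastically simplified — it assumes transactions commit in TXID order and ignores in-progress transactions — but it shows how xmin/xmax decide which row version a snapshot sees:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    balance: int
    xmin: int            # transaction ID that created this version
    xmax: Optional[int]  # transaction ID that superseded it, or None if live

def visible(version: RowVersion, snapshot_xid: int) -> bool:
    # A version is visible to a snapshot if it was created by a transaction
    # that committed before the snapshot, and not yet superseded by one.
    created_before = version.xmin <= snapshot_xid
    not_yet_superseded = version.xmax is None or version.xmax > snapshot_xid
    return created_before and not_yet_superseded

# Txn 100 updated the row: the old version got xmax=100, the new one xmin=100.
history = [
    RowVersion(balance=1000, xmin=50, xmax=100),
    RowVersion(balance=1100, xmin=100, xmax=None),
]

def read(snapshot_xid: int) -> int:
    return next(v.balance for v in history if visible(v, snapshot_xid))

print(read(99))   # 1000 — snapshot taken before txn 100 committed
print(read(101))  # 1100 — snapshot taken after
```

Old versions never get overwritten in place; a background process (VACUUM, in PostgreSQL) reclaims them once no snapshot can still see them.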

Let’s see our example again with REPEATABLE READ:

  1. Txn A starts (Snapshot at TXID 100). It reads the balance=1000 (created by an old TXID, say 50).
  2. Txn B starts (Snapshot at TXID 101). It also reads balance=1000.
  3. Txn A commits first. Its update marks the old version as superseded (xmax=100) and creates a new one: (balance=1100, xmin=100).
  4. Txn B now issues its own UPDATE. The database checks: “You read a row version (from TXID 50) that has since been replaced by another committed transaction (TXID 100). Your premise is stale!” To prevent a lost update, it aborts Txn B with a serialization error (in PostgreSQL, “could not serialize access due to concurrent update”).

Success! The lost update was prevented. The cost? My application now has to be smart enough to catch that error and retry the transaction.
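That retry logic usually lives in a small wrapper. Here is a Python sketch of one — SerializationError is a hypothetical stand-in for whatever your driver raises on SQLSTATE 40001, and deposit_txn simulates a transaction that conflicts twice before succeeding:

```python
import random
import time

class SerializationError(Exception):
    """Hypothetical stand-in for the driver's serialization-failure error
    (SQLSTATE 40001 in PostgreSQL)."""

def with_retries(txn_fn, max_attempts=5):
    # Re-run the whole transaction on serialization failure, with a small
    # randomized backoff so colliding clients don't retry in lockstep.
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationError:
            if attempt == max_attempts:
                raise
            time.sleep(random.uniform(0, 0.01 * attempt))

# A simulated transaction that conflicts twice before getting through.
attempts = {"n": 0}

def deposit_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationError("could not serialize access due to concurrent update")
    return "committed"

result = with_retries(deposit_txn)
print(result)  # committed (on the third attempt)
```

The key point is that the *entire* transaction must be re-run from the top — the stale read is what got rejected, so you cannot just re-issue the final write.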

The Final Showdown: Pessimistic vs. Optimistic

This led me to the two great philosophies of concurrency control.

  1. Pessimistic Locking (SELECT ... FOR UPDATE): This is the “ask for permission first” approach. Inside a transaction, you explicitly lock the row you intend to update.

    SELECT balance FROM accounts WHERE id = 123 FOR UPDATE;
    

    This forces any other transaction trying to touch that row to wait in a single-file line. It’s simple and robust, but if you have a highly contended row, it turns your parallel database into a serial process for that one row.

  2. Optimistic Control (REPEATABLE READ / SERIALIZABLE): This is the “act first, ask for forgiveness” approach, and it’s built on MVCC. The database lets everyone work concurrently on their own snapshots and only checks, when they write or commit, whether their work created a logical contradiction. It’s fast when conflicts are rare, but can lead to a storm of retries when conflicts are common.

The Real Engineering Problem: The Hotspot

This brought me to my final challenge. What if you have a row that is always under high contention? Like a central bank account in a wallet service that every single deposit touches.

  • Pessimistic locking would be too slow, creating a massive traffic jam.
  • Optimistic locking would be a disaster, with nearly every transaction failing and retrying.

And here, I realized the most profound lesson: when the database’s built-in tools aren’t a good fit, the problem isn’t the database, it’s the workload. The solution must be architectural.

Instead of hitting the database for every transaction, we can:

  • Batch: Collect 100 deposit requests in a message queue and process them in a single, efficient database transaction. This cuts contention on the hot row by a factor of up to 100.
  • Shard: Split the central account into 10 smaller sub-accounts and spread the load across them; the true balance is the sum of the shards.
  • Redesign: Change the data model entirely. Instead of updating a counter, insert new event records; appends never contend with each other, and the balance becomes an aggregation over the events.
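Batching is the easiest of the three to sketch. In this Python toy (in-memory SQLite, a hypothetical in-process queue of 100 deposits), the hot row is touched once per batch instead of once per deposit:

```python
import queue
import sqlite3

# In-memory SQLite as the stand-in database; one hot "central account" row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 0)")

deposits = queue.Queue()
for amount in [100] * 100:  # 100 incoming deposit requests pile up in a queue
    deposits.put(amount)

def flush_batch(q, max_batch=100):
    # Drain up to max_batch deposits and apply their sum in ONE statement,
    # so the hot row is locked once per batch instead of once per deposit.
    total, n = 0, 0
    while n < max_batch and not q.empty():
        total += q.get()
        n += 1
    if n:
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 1", (total,))
        conn.commit()
    return n

applied = flush_batch(deposits)
print(applied)  # 100 deposits applied with a single UPDATE
print(conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0])  # 10000
```

The trade-off is latency: a deposit isn’t durable until its batch flushes, so real systems pair this with a flush interval and an acknowledgement strategy.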

My journey from a simple UPDATE statement revealed a universe of complexity. The database isn’t just a box for data; it’s a sophisticated engine for managing concurrency. Our job as developers isn’t just to write queries, but to understand these powerful tools and design systems that work with them, not against them.

Databases Transactions Concurrency Control Isolation Levels Database Design
