Handling Concurrent Edits in Wiki-Style Revision Systems with SQLite
Issue Overview: Race Conditions in Revision-Based Editing Systems
The core challenge in implementing a wiki-style editing system with SQLite lies in preventing concurrent edits from overwriting each other’s changes. When two users submit edits to the same page nearly simultaneously, the database must ensure that only one revision is accepted as the latest, while the other user is notified of a conflict. The naive approach of checking the current revision before inserting a new one fails because the check and insertion are non-atomic operations, creating a race condition.
In this scenario, the application’s write
function performs two steps:
- Check if the latest revision ID matches the one the user based their edit on.
- Insert a new revision if the check passes.
These steps are vulnerable to interleaving execution threads. For example:
- User A reads revision #5 as the latest.
- User B reads revision #5 as the latest.
- User A inserts revision #6.
- User B’s check still passes (since their local copy of the latest revision is #5), and they insert revision #7, overwriting User A’s changes.
This violates the desired behavior where User B should be notified of the conflict after User A’s edit is committed. The problem is exacerbated in web applications where HTTP requests are handled concurrently by multiple workers (e.g., Gunicorn with threaded or multiprocess workers).
Possible Causes: Non-Atomic Operations and Isolation Level Misconfigurations
1. Non-Atomic Check-and-Insert Workflow
The write
function’s logic is split into separate database operations:
- A
SELECT
query to fetch the latest revision. - An
INSERT
query to add a new revision.
Between these operations, other transactions can modify the Revision
table, rendering the initial check invalid. SQLite’s default autocommit mode treats each statement as a transaction, so the SELECT
and INSERT
run in separate transactions unless explicitly wrapped in a transaction block.
2. Lack of Row-Level Locking in SQLite
SQLite uses a file-level lock for writes, which serializes all write transactions. However, this does not prevent race conditions in application logic where a transaction reads stale data. For example:
- Transaction A reads revision #5.
- Transaction B commits revision #6.
- Transaction A proceeds to insert revision #7, unaware of Transaction B’s changes.
This occurs because SQLite’s SERIALIZABLE
isolation level ensures transactional consistency but does not implicitly enforce application-level constraints like revision lineage.
3. Unique Constraint Limitations for Revision Lineage
A proposed solution involves adding a parent_id
column to enforce a unique chain of revisions. However, the first revision for any page would have a NULL
parent ID, violating uniqueness if multiple pages exist. Composite unique constraints (e.g., UNIQUE(page_id, parent_id)
) partially address this but require careful handling of NULL
values, which SQLite treats as distinct.
4. Misunderstanding of SELECT FOR UPDATE
Semantics
SQLite does not support SELECT FOR UPDATE
, which in other databases locks rows to prevent concurrent modifications. Developers often attempt to emulate this behavior with exclusive transactions, but without proper isolation, these attempts fail to prevent phantom reads or lost updates.
Troubleshooting Steps, Solutions & Fixes
Solution 1: Atomic Transactions with Pessimistic Locking
Step 1: Enclose Operations in an Explicit Transaction
Wrap the revision check and insertion in a single transaction using SQLAlchemy’s transaction management:
from sqlalchemy import text
def write(page_slug, user_revision_id, new_content):
with session.begin(): # Starts a transaction
# Fetch the latest revision WITH exclusive lock (emulated)
latest_rev = session.execute(
text("SELECT id FROM revision WHERE page_id = "
"(SELECT id FROM page WHERE slug = :slug) "
"ORDER BY created_at DESC LIMIT 1"),
{"slug": page_slug}
).scalar()
if latest_rev != user_revision_id:
raise ConflictError("Page has been modified by another user.")
# Insert new revision
new_rev = Revision(
page_id=page.id,
content=new_content,
parent_id=latest_rev
)
session.add(new_rev)
# Transaction commits here
Key Details:
- The
WITHOUT ROWID
clause is unnecessary here, but the explicit transaction ensures that theSELECT
andINSERT
are atomic. - SQLite’s file-level lock guarantees that no two write transactions occur concurrently, but read operations in other transactions may still see stale data.
Step 2: Use BEGIN EXCLUSIVE
for Critical Sections
To emulate row-level locking, start an exclusive transaction before reading the latest revision:
session.execute(text("BEGIN EXCLUSIVE"))
try:
latest_rev = get_latest_revision(page_slug)
if latest_rev != user_revision_id:
raise ConflictError()
create_new_revision(latest_rev)
session.commit()
except:
session.rollback()
raise
Trade-offs:
- Pros: Eliminates race conditions by blocking all other database accesses during the transaction.
- Cons: Reduces concurrency and may lead to timeouts or deadlocks under high load.
Solution 2: Optimistic Concurrency Control with Version Numbers
Step 1: Add a Version Column to the Page Table
Add an integer version
column to the Page
table, incremented on each revision:
ALTER TABLE page ADD COLUMN version INTEGER NOT NULL DEFAULT 0;
Step 2: Update the Version During Revision Insertion
Modify the write
function to check the version and increment it atomically:
def write(page_slug, user_version, new_content):
with session.begin():
page = session.execute(
text("SELECT id, version FROM page WHERE slug = :slug"),
{"slug": page_slug}
).fetchone()
if page.version != user_version:
raise ConflictError()
# Insert new revision
new_rev = Revision(page_id=page.id, content=new_content)
session.add(new_rev)
# Increment page version
session.execute(
text("UPDATE page SET version = version + 1 WHERE id = :id"),
{"id": page.id}
)
Key Details:
- The
version
column acts as a guard against concurrent updates. - The
UPDATE
statement increments the version atomically, ensuring that only one transaction can succeed per version.
Solution 3: Linear Revision History with Unique Constraints
Step 1: Add a parent_id
Column with Composite Uniqueness
Modify the Revision
table to include parent_id
and enforce uniqueness per page:
CREATE TABLE revision (
id INTEGER PRIMARY KEY,
page_id INTEGER NOT NULL,
parent_id INTEGER UNIQUE, -- Allows one child per parent
created_at TIMESTAMP,
content TEXT,
FOREIGN KEY (page_id) REFERENCES page(id),
FOREIGN KEY (parent_id) REFERENCES revision(id),
-- Ensure only one NULL parent per page (see Step 2)
UNIQUE (page_id, parent_id)
);
Step 2: Handle Initial Revisions with a Sentinel Value
Replace NULL
with a sentinel value (e.g., -1
) for the first revision’s parent_id
:
def create_page(slug, title, initial_content):
with session.begin():
page = Page(slug=slug, title=title)
session.add(page)
session.flush() # Assigns page.id
initial_rev = Revision(
page_id=page.id,
content=initial_content,
parent_id=-1 # Sentinel
)
session.add(initial_rev)
Key Details:
- The
UNIQUE (page_id, parent_id)
constraint ensures linear history. - The sentinel value avoids
NULL
uniqueness issues.
Solution 4: Conflict Detection with Triggers
Step 1: Create a Trigger to Validate Parent Revisions
Add a trigger to ensure that a new revision’s parent is the current latest:
CREATE TRIGGER validate_revision_parent
BEFORE INSERT ON revision
FOR EACH ROW
BEGIN
SELECT CASE
WHEN NEW.parent_id != (
SELECT id FROM revision
WHERE page_id = NEW.page_id
ORDER BY created_at DESC
LIMIT 1
)
THEN RAISE(ABORT, 'Parent revision is not the latest')
END;
END;
Key Details:
- The trigger runs within the inserting transaction, ensuring atomicity.
- Conflicts raise an error that rolls back the transaction, which the application must handle.
Solution 5: Application-Level Locking with Mutexes
Step 1: Use a Global Lock for Page Writes
Implement a per-page mutex to serialize writes:
from threading import Lock
page_locks = defaultdict(Lock)
def write(page_slug, ...):
with page_locks[page_slug]:
latest_rev = get_latest_revision(page_slug)
if latest_rev != user_revision_id:
raise ConflictError()
create_new_revision(...)
Trade-offs:
- Pros: Simple to implement in single-process environments.
- Cons: Fails with multiple Gunicorn workers or distributed systems.
Step 2: Use Advisory Locks in SQLite
Utilize SQLite’s sqlite3_update_hook
or PRAGMA locking_mode=EXCLUSIVE
for cross-process locks.
Final Recommendations
- Optimistic Concurrency Control is ideal for low-contention scenarios.
- Exclusive Transactions suit high-contention environments but reduce throughput.
- Composite Unique Constraints with sentinel values enforce revision lineage without triggers.
By combining atomic transactions, version columns, and careful constraint design, developers can implement robust wiki-style editing in SQLite while avoiding race conditions.