Read-Only SQLite Connections and Database Copy Safety: Transaction Risks Explained
Understanding Read-Only Connections and Database Copy Safety During Transactions
Issue Overview: The Intersection of Read-Only Transactions and File State Consistency
SQLite databases are widely used for their portability and reliability, but certain operational nuances can lead to data corruption if misunderstood. A critical question arises when combining read-only connections with file-copy operations: Does an active transaction on a read-only connection leave the database file in a "not-safe-to-copy" state?
Core Concepts: Safe-to-Copy States and Transaction Mechanics
A SQLite database enters a "safe-to-copy" state when:
- No transactions are active.
- The write-ahead log (WAL) is fully checkpointed into the main database file.
- No process holds exclusive locks (e.g., during writes).
The SQLite documentation explicitly warns that copying a database during an active transaction risks corruption. This applies to write transactions by default, as they modify the database or WAL file. However, the behavior of read-only transactions (e.g., SELECT
queries wrapped in BEGIN...COMMIT
) is less clear.
The Read-Only Connection Paradox
A read-only connection opened via mode=ro
URI parameter or SQLITE_OPEN_READONLY
flag cannot modify the database. However, SQLite’s locking and transaction system operates independently of file permissions:
- Filesystem-level read-only: The OS prevents writes to the file. SQLite may still attempt to create/shared locks.
- SQLite
mode=ro
: The engine enforces read-only behavior at the application layer, bypassing filesystem restrictions.
Key observations from the original discussion:
- Read-only connections do not alter the database file’s modification timestamp during transactions.
- Despite this, the documentation does not explicitly exempt read-only transactions from creating "not-safe-to-copy" states.
Hidden Risks with WAL Mode
In WAL mode, even read-only transactions interact with auxiliary files (-wal
and -shm
). A read-only connection may:
- Read uncheckpointed data from the WAL file.
- Hold shared locks to maintain transaction isolation.
- Prevent checkpointing if other connections are active.
Copying only the main database file while ignoring WAL/shared-memory files in this state creates an inconsistent snapshot. This risk persists even if the connection is read-only.
Possible Causes: Why Read-Only Transactions Might Compromise Copy Safety
1. Lock Acquisition and File Descriptor Behavior
SQLite uses a locking protocol to manage concurrency. Read-only connections acquire shared locks during transactions:
- Shared lock (SQLITE_LOCK_SHARED): Allows multiple readers to coexist.
- Reserved lock (SQLITE_LOCK_RESERVED): Held by writers preparing to commit.
When a read-only transaction starts:
- The connection obtains a shared lock on the database file.
- In WAL mode, it may access the WAL index in the
-shm
file. - The OS retains an open file descriptor to the database, WAL, and SHM files.
Implication: A file copy operation (e.g., cp
, rsync
) executed while these locks/file descriptors are active may capture partially written pages or orphaned WAL entries.
2. Filesystem Permissions vs. SQLite’s Internal Logic
A common misconception is that filesystem-level read-only permissions eliminate copy risks. This is false because:
- WAL/shared-memory files are separate: Even if the main database file is read-only, SQLite may write to
-wal
/-shm
files if they reside in a writable directory. - Locking is independent of file modification: Locks are managed via the filesystem’s advisory locking mechanism, not file content changes.
Example Scenario:
- Database
app.db
is marked read-only at the OS level. - A read-only connection opens
app.db
in WAL mode. - The connection starts a transaction, reading data from
app.db-wal
. - Another user copies
app.db
but notapp.db-wal
. - The copied database is missing uncheckpointed data, rendering it corrupt.
3. Transaction Isolation and Snapshot Validity
SQLite’s read transactions operate under "repeatable read" isolation by default. To achieve this:
- The connection maintains a snapshot of the database state at the transaction’s start.
- In WAL mode, this snapshot relies on the WAL file’s contents.
Critical Insight: The snapshot is logical, not physical. Copying the database file does not replicate the transaction’s snapshot context. If the original database’s WAL is modified post-copy, the copied file becomes inconsistent.
Resolving the Dilemma: Safe Copy Strategies and Mitigation Techniques
Step 1: Use the SQLite Online Backup API
The sqlite3_backup_*
API is the only method guaranteed to produce a consistent copy, regardless of connection type. It works by:
- Reading database pages directly from the source connection.
- Bypassing filesystem-level locks and WAL complexities.
- Operating incrementally to avoid blocking other connections.
Implementation Example (C API):
sqlite3 *src_db;
sqlite3_open_v2("file:app.db?mode=ro", &src_db, SQLITE_OPEN_READONLY, NULL);
sqlite3 *dst_db;
sqlite3_open("backup.db", &dst_db);
sqlite3_backup *backup = sqlite3_backup_init(dst_db, "main", src_db, "main");
if (backup) {
sqlite3_backup_step(backup, -1); // Copy all pages
sqlite3_backup_finish(backup);
}
Step 2: Enforce Checkpointing Before Manual Copies
If the backup API is unavailable (e.g., in shell scripts), force a WAL checkpoint:
-- From a read-write connection:
PRAGMA wal_checkpoint(FULL);
Post-Checkpoint Verification:
- Confirm
app.db-wal
is 0 bytes. - Ensure no transactions are active (check
sqlite3_db_status()
orPRAGMA lock_status
).
Step 3: Validate Filesystem-Level Copy Procedures
When manual copying is unavoidable:
- Include all auxiliary files:
cp app.db app.db-wal app.db-shm backup/
- Use atomic filesystem operations:
sqlite3 app.db ".backup backup.db"
- Leverage
VACUUM INTO
for self-contained copies:VACUUM INTO 'backup.db';
Step 4: Monitor Active Transactions and Locks
Implement pre-copy checks to avoid unsafe states:
- Query
sqlite3_temp_directory
: Ensure no transient objects exist. - Check
PRAGMA lock_status
:-- Returns held locks for all attached databases PRAGMA lock_status;
- Audit open file descriptors (Linux):
lsof +D /path/to/database
Step 5: Test with Long-Running Simulations
Replicate the original user’s testing strategy:
- Spawn multiple read-only connections executing
SELECT
transactions. - Run concurrent copy processes using
cp
,rsync
, and backup APIs. - Validate copied databases via integrity checks:
PRAGMA quick_check;
Sample Test Framework (Python):
import sqlite3
import threading
import shutil
def run_read_transaction(db_path):
conn = sqlite3.connect(f'file:{db_path}?mode=ro', uri=True)
cursor = conn.cursor()
cursor.execute("BEGIN")
cursor.execute("SELECT COUNT(*) FROM some_table")
cursor.fetchall()
cursor.execute("COMMIT")
conn.close()
def copy_db(src, dst):
shutil.copy(src, dst)
# Verify integrity
verify_conn = sqlite3.connect(dst)
verify_conn.execute("PRAGMA quick_check").fetchone()
verify_conn.close()
# Repeat these in parallel threads
Final Recommendations for Developers
- Default to the backup API for all production-critical copies.
- Document user responsibilities when allowing manual file operations.
- Prefer
VACUUM INTO
over filesystem copies for simplicity. - Monitor SQLite’s error logs for
SQLITE_CORRUPT
orSQLITE_READONLY
warnings.
By addressing the interplay of transactions, locking, and file states, developers can mitigate corruption risks while accommodating legitimate use cases for read-only connections.