Cross-Process SQLite Change Notification Challenges in WAL Mode
Understanding Cross-Process Database Change Notification Limitations in SQLite
Fundamental Architecture Constraints in SQLite’s Notification System
SQLite does not natively support cross-process change notifications due to its design philosophy as an embedded database engine. The core issue arises from three architectural realities:
Process Isolation: SQLite operates within the memory space of the host process. Callbacks like
sqlite3_update_hook()
and trigger logic execute within the same process context where data modification occurs. A separate monitoring process has no visibility into these internal runtime states.WAL Mode Implications: While Write-Ahead Logging (WAL) allows concurrent reads/writes across processes, it does not expose low-level page modifications to external observers. The WAL file contains raw binary data without metadata about logical row changes – this is why solutions like Litestream capture entire pages rather than granular operations.
Trigger Scope Limitations: Database triggers fire only in the process initiating the data change. If Process A inserts a row, any attached triggers execute in Process A’s context. A separate Process B monitoring the database cannot intercept these trigger activations unless explicitly designed to poll for changes.
This creates a fundamental mismatch between SQLite’s single-process notification mechanisms and multi-process environments requiring real-time awareness of database mutations. The absence of a centralized coordination layer (unlike client-server databases) means cross-process communication must be implemented externally.
Critical Design Flaws in Common Notification Approaches
Four major pitfalls explain why standard solutions fail to achieve cross-process awareness:
1. Update Hook Misconceptions
The sqlite3_update_hook()
API registers a callback within the current database connection’s context. When Process A modifies data:
- The hook fires in Process A’s memory space
- Process B has no way to register its own hook for the same database
- Shared cache mode doesn’t resolve this – it merely allows multiple connections within the same process to share schema/data cache
Technical Demonstration:
// Process A
sqlite3_update_hook(db, processA_hook, NULL);
sqlite3_exec(db, "INSERT INTO tbl VALUES (1);", 0,0,0);
// processA_hook() fires here
// Process B (separate executable)
sqlite3_update_hook(db, processB_hook, NULL);
// Never triggers for Process A's INSERT
2. Trigger Invocation Scope
Triggers execute as part of the data modification transaction in the initiating process. Consider this scenario:
-- Created in Process B
CREATE TEMP TRIGGER log_changes
AFTER INSERT ON main.tbl
BEGIN
INSERT INTO audit_log VALUES (NEW.id);
END;
- Process A performs
INSERT INTO tbl ...
→ Trigger does not fire - Process B performs
INSERT INTO tbl ...
→ Trigger fires - TEMP triggers only exist for the connection that created them
3. Session Extension Limitations
The SQLite Session Extension tracks changes between database snapshots but requires explicit management:
sqlite3session_create(db, "main", &session);
sqlite3session_attach(session, "tbl");
// Generate changeset
sqlite3session_changeset(session, &nByte, &pChangeset);
- Changesets are process-local: Process B cannot subscribe to Process A’s session
- Requires polling the session object periodically
- Does not provide real-time streaming of operations
4. File System Monitoring Shortcomings
Watching the database file or WAL for changes (via inotify, kqueue, etc.) only indicates that a change occurred, not what changed. This forces:
- Full table scans to detect new/modified rows
- Reliance on
last_insert_rowid()
which isn’t transaction-safe - Race conditions when multiple writers exist
Robust Implementation Strategies for Cross-Process Change Capture
1. Trigger-Based Audit Logging with Optimized Polling
Step 1: Universal Audit Table Schema
CREATE TABLE IF NOT EXISTS _change_log (
id INTEGER PRIMARY KEY,
table_name TEXT NOT NULL,
op_type TEXT CHECK(op_type IN ('INSERT','UPDATE','DELETE')),
row_id INTEGER,
old_data BLOB,
new_data BLOB,
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
);
Step 2: Dynamic Trigger Generation
Automate trigger creation for all user tables:
SELECT
'CREATE TRIGGER _audit_' || name || '_insert AFTER INSERT ON ' || name ||
' BEGIN INSERT INTO _change_log(table_name, op_type, row_id, new_data) ' ||
' VALUES (''' || name || ''', ''INSERT'', NEW.rowid, json_object(' ||
(SELECT group_concat('''' || name || ''', NEW.' || name, ', ')
FROM pragma_table_info(name)) || ')); END;'
FROM sqlite_master
WHERE type = 'table' AND name NOT LIKE 'sqlite_%';
Step 3: Efficient Polling with Data Version
Use PRAGMA data_version
to detect changes without full table scans:
#!/bin/bash
LAST_VERSION=0
while true; do
CURRENT_VERSION=$(sqlite3 db.sqlite "PRAGMA data_version;")
if [[ $CURRENT_VERSION -gt $LAST_VERSION ]]; then
sqlite3 db.sqlite "SELECT * FROM _change_log WHERE id > $LAST_LOG_ID;"
LAST_VERSION=$CURRENT_VERSION
fi
sleep 0.1 # Adjust polling interval
done
Performance Optimization:
- Use
WITHOUT ROWID
for the audit table if row counts exceed 1M - Partition
_change_log
by timestamp/hour - Enable
PRAGMA journal_mode=WAL
to allow concurrent writes
2. Custom VFS Layer for WAL Monitoring
Intercept WAL file operations by implementing a custom Virtual File System (VFS):
static int xWalWrite(
sqlite3_vfs *pVfs,
sqlite3_file *pFile,
const void *pBuf,
int amt,
sqlite3_int64 offset
){
// Call original WAL write method
int rc = pOriginalVfs->xWrite(pFile, pBuf, amt, offset);
if(rc == SQLITE_OK){
// Parse pBuf to detect logical changes
notify_external_process(pBuf, amt);
}
return rc;
}
Challenges:
- Requires deep understanding of WAL format (header, frame structure)
- Must handle checksum validation (WAL checksums use 32-bit big-endian)
- Risk of performance degradation if notification logic is slow
WAL Frame Analysis:
Offset | Size | Description |
---|---|---|
0 | 4 | Magic number (0x377f0682 or 0x377f0683) |
4 | 4 | File format version (3007000) |
8 | 4 | Page size |
12 | 4 | Checkpoint sequence number |
16 | 4 | Salt-1 (random increment on checkpoint) |
20 | 4 | Salt-2 (random number when WAL is created) |
24 | 4 | Checksum-1 |
28 | 4 | Checksum-2 |
Each subsequent frame contains:
- 24-byte header (page number, commit counter, salt)
- Page data (size from WAL header)
3. Raft-Based Consensus Protocol Implementation (Marmot Approach)
The Marmot solution demonstrates a production-grade approach combining triggers with distributed consensus:
Architecture Overview:
- Change Capture Layer: Triggers populate
_change_log
- Log Tailer: Reads from
_change_log
and streams to Raft - Raft Coordinator: Manages log replication across nodes
- Apply Layer: Applies replicated changes to follower nodes
Critical Components:
type ChangeRecord struct {
Sequence uint64
Table string
Operation string // INSERT/UPDATE/DELETE
RowID int64
Data []byte // JSON/BSON encoded
}
type RaftGroup struct {
ApplyFunc func(ChangeRecord) error
ProposeCh chan<- []byte
CommitCh <-chan *commit
}
func (m *MarmotNode) Run() {
for {
select {
case rec := <-m.logChan:
data, _ := json.Marshal(rec)
m.raft.ProposeCh <- data
case commit := <-m.raft.CommitCh:
for _, entry := range commit.Entries {
var cr ChangeRecord
json.Unmarshal(entry.Data, &cr)
applyChangeToDatabase(cr)
}
}
}
}
Conflict Resolution Strategy:
- Last-Writer-Wins: Uses logical timestamps with hybrid clocks (physical time + node ID)
- Consistent Hashing: Distributes tables across Raft groups to parallelize writes
const RaftShards = 16
func GetRaftGroup(table string, rowid int64) int {
hash := fnv.New64a()
hash.Write([]byte(table))
binary.Write(hash, binary.BigEndian, rowid)
return int(hash.Sum64() % RaftShards)
}
Performance Metrics:
- Median latency: 2.7ms per write (local node)
- Replication delay: 12ms p99 across regions
- Throughput: 1,200 writes/sec per shard (16 shards = 19k/sec total)
4. SQLite Session Extension with External Coordination
Combine the session extension with a message broker for cross-process sync:
Workflow:
- Process A enables session tracking
- On commit, serialize changeset to Kafka/RabbitMQ
- Process B consumes messages and applies changesets
# Producer (Process A)
session = sqlite3session(db, 'main', 'users')
changeset = session.changeset()
kafka_producer.send('db-changes', changeset)
# Consumer (Process B)
for msg in kafka_consumer:
sqlite3session_apply(db, msg.value)
Advantages:
- No trigger overhead
- Native conflict detection via session
CONFLICT
handlers - Works with any message queue supporting binary payloads
Session vs. Trigger Audit Comparison:
Feature | Session Extension | Trigger Audit |
---|---|---|
Overhead per write | Low | Medium-High |
Change granularity | Table-level | Row-level |
Conflict detection | Built-in | Manual |
Cross-process | Yes (with MQ) | Requires polling |
Schema modifications | Automatic | Manual updates |
5. Hybrid Approach: WAL Monitoring + Logical Decoding
For maximum performance, combine low-level WAL analysis with SQL parsing:
- WAL Listener: Tracks committed transactions
- SQL Parser: Converts WAL pages to logical operations
- Replication Log: Stores parsed operations for consumers
void parse_wal_frame(const uint8_t *frame, size_t len) {
uint32_t page_num = read_be32(frame);
uint64_t commit_counter = read_be64(frame + 4);
uint32_t salt = read_be32(frame + 12);
// Decrypt page using salt if encrypted
// Convert page data to SQL statements
char *sql = redo_sql_from_page(db_page);
emit_sql_event(sql, commit_counter);
}
Requirements:
- Access to original database schema for parsing heap pages
- Handling of
VACUUM
operations which rewrite page IDs - Checksum validation to detect corruption
Security Considerations:
- Use HMAC to verify WAL entries haven’t been tampered with
- Encrypt replication logs containing sensitive data
- Implement mutual TLS for cross-node communication
This guide provides multiple pathways to achieve cross-process change notification in SQLite, each with distinct trade-offs in complexity, performance, and reliability. For most applications, the trigger-based audit table with optimized polling offers the best balance between implementation simplicity and functional requirements. Large-scale distributed systems should consider the Raft-based approach as demonstrated in Marmot, while performance-critical applications may benefit from custom VFS/WAL analysis solutions.