SQLite WAL File Disappearance Due to Multiple Library Instances and Connection Handling
Issue Overview: WAL File Absence Despite Active Connections in Multi-Component Environments
The core problem revolves around the unexpected disappearance of the SQLite Write-Ahead Logging (WAL) file (-wal
) in an Android application using a hybrid C++/Lua codebase. The WAL file either becomes empty (0 bytes) or vanishes entirely during runtime, even when transactions are actively being committed and connections are presumed to remain open. This behavior contradicts SQLite’s documented WAL mechanics, where the -wal
file should persist as long as at least one database connection is active. The issue is observed in an environment with two distinct components:
- A C++/NDK layer handling primary database operations (inserts, selects) with explicit transaction control.
- Lua scripts executing secondary queries, which open and close their own database connections.
Key observations from the debugging process:
- The WAL file disappears specifically when Lua script connections are closed, even if the C++ connection is assumed to remain open.
- Data integrity is preserved: Checkpoints (
PRAGMA wal_checkpoint
) transfer committed transactions to the main database file, confirming that writes are not lost. - The problem vanishes in minimal reproducible examples but resurfaces in the full codebase, implicating interactions between components.
- SQLite version mismatches between the C++ and Lua layers were eventually identified as the root cause.
Possible Causes: Multi-Instance SQLite Libraries and Connection Lifecycle Conflicts
Three primary factors contribute to the disappearance of the WAL file:
1. Multiple SQLite Library Instances
SQLite tracks open connections per library instance. If the C++ and Lua layers link against separate SQLite binaries (or dynamically load different versions), each instance maintains its own internal connection registry. Consequently:
- Closing a Lua connection triggers SQLite’s cleanup routine for its instance, unaware of connections from the C++ instance.
- The WAL file is deleted if the Lua instance believes it is closing the last connection, even if the C++ instance still has an active handle.
This explains why the WAL file vanishes despite the C++ layer executing subsequent inserts: The C++ connection exists in a separate SQLite instance, which does not recognize the Lua instance’s closure.
2. Android-Specific File Visibility and Caching
Android’s file system caching mechanisms (particularly on ext4 partitions) may delay the visibility of file changes to user-space tools like Device File Explorer. However, this does not fully account for the observed behavior because:
- The WAL file disappearance is persistent across file system inspections (including manual file copying).
- Data integrity checks confirm that the WAL file is physically deleted, not just hidden from view.
3. Implicit Checkpointing on Connection Closure
SQLite automatically performs a checkpoint when the last connection to a database is closed. In multi-instance scenarios:
- The Lua instance’s connection closure triggers a checkpoint (merging WAL contents into the main database).
- The C++ instance, unaware of this checkpoint, continues operating on a WAL file that has been logically emptied.
This results in an apparent contradiction: The WAL file is empty or missing, yet the C++ layer can still write data because it operates on a separate WAL context managed by its SQLite instance.
Troubleshooting Steps, Solutions & Fixes: Resolving WAL File Instability
Step 1: Validate SQLite Library Unification
Objective: Ensure all components use the same SQLite library instance.
Audit Library Dependencies
- For C++/NDK: Verify that the SQLite amalgamation (
sqlite3.c
/sqlite3.h
) is statically linked and not relying on system-provided libraries. - For Lua: Confirm that Lua’s SQLite bindings (e.g.,
lua-sqlite3
) are linked against the same SQLite version as the C++ layer.
Example NDK build configuration:
add_library(sqlite3 STATIC sqlite3.c) target_link_libraries(myapp sqlite3 ...)
- For C++/NDK: Verify that the SQLite amalgamation (
Enforce Version Consistency
- Embed a specific SQLite version (e.g., 3.27.2) in both C++ and Lua components.
- At runtime, query the SQLite version from both layers:
// C++ printf("SQLite version: %s\n", sqlite3_libversion());
-- Lua print(db:exec("SELECT sqlite_version()"))
Eliminate Duplicate Initializations
- If Lua initializes its own SQLite context, replace it with a shared handle from the C++ layer. For LuaJIT FFI:
local ffi = require("ffi") ffi.cdef[[ sqlite3* sqlite3_get_handle_from_cpp(); // Implemented in C++ ]] local handle = ffi.C.sqlite3_get_handle_from_cpp()
- If Lua initializes its own SQLite context, replace it with a shared handle from the C++ layer. For LuaJIT FFI:
Step 2: Refine Connection Lifecycle Management
Objective: Prevent premature checkpointing due to connection closures.
Explicit Connection Pooling
- Maintain a single global connection in the C++ layer, reused across threads with
SQLITE_OPEN_FULLMUTEX
. - Extend connection lifetime to match the application lifecycle (open on startup, close on termination).
// C++ singleton sqlite3* get_global_connection() { static sqlite3* handle = nullptr; if (!handle) { sqlite3_open_v2("main.db", &handle, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX, nullptr); sqlite3_exec(handle, "PRAGMA journal_mode=WAL;", nullptr, nullptr, nullptr); } return handle; }
- Maintain a single global connection in the C++ layer, reused across threads with
Lua Script Connection Sharing
- Modify Lua scripts to use the existing C++ connection instead of opening new ones. For Lua-SQLite3:
local db = sqlite3.open_ptr(ffi.cast("sqlite3*", get_global_connection()))
- Modify Lua scripts to use the existing C++ connection instead of opening new ones. For Lua-SQLite3:
Disable Implicit Checkpointing
- Set
PRAGMA wal_autocheckpoint=0;
to prevent automatic checkpointing on connection closure. Combine with manual checkpoints during low-activity periods.
sqlite3_exec(handle, "PRAGMA wal_autocheckpoint=0;", nullptr, nullptr, nullptr);
- Set
Step 3: Diagnose File System Interactions
Objective: Rule out Android-specific file system anomalies.
Direct File Monitoring
- Use
inotify
on the database directory to log file creation/deletion events:#include <sys/inotify.h> void monitor_wal() { int fd = inotify_init(); inotify_add_watch(fd, "/data/data/com.package.my/files", IN_CREATE | IN_DELETE); // Log events in a background thread }
- Use
Synchronous File Operations
- Ensure
PRAGMA synchronous=FULL;
is set to force immediate writes to storage, bypassing opportunistic caching.
- Ensure
VFS Configuration
- Use the
unix-excl
VFS to enforce advisory locking compatible with multi-threaded access:sqlite3_open_v2("main.db", &handle, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, "unix-excl");
- Use the
Step 4: Transaction and Threading Hygiene
Objective: Elimulate race conditions and transaction leaks.
Explicit Transaction Timeouts
- Implement retry logic with exponential backoff for
SQLITE_BUSY
errors:int execute_with_retry(sqlite3* db, const char* sql) { int rc; do { rc = sqlite3_exec(db, sql, nullptr, nullptr, nullptr); if (rc == SQLITE_BUSY) usleep(100000); // 100ms } while (rc == SQLITE_BUSY); return rc; }
- Implement retry logic with exponential backoff for
Thread Affinity for Connections
- Bind database connections to specific threads using thread-local storage:
thread_local sqlite3* local_handle = nullptr; void init_thread_connection() { if (!local_handle) { sqlite3_open_v2("main.db", &local_handle, SQLITE_OPEN_READWRITE | SQLITE_OPEN_FULLMUTEX, nullptr); } }
- Bind database connections to specific threads using thread-local storage:
Cursor Finalization
- Ensure all prepared statements (
sqlite3_stmt
) are finalized before closing connections:-- Lua stmt:finalize() -- Mandatory even if garbage-collected
- Ensure all prepared statements (
Step 5: Advanced Debugging Instrumentation
Objective: Capture granular SQLite state changes.
WAL Hook Tracing
- Extend the
sqlite3_wal_hook
to log checkpoint events and WAL size changes:int wal_hook(void* userdata, sqlite3* db, const char* dbname, int pages) { printf("WAL Hook: %s pages=%d\n", dbname, pages); return SQLITE_OK; } sqlite3_wal_hook(handle, wal_hook, nullptr);
- Extend the
Connection State Monitoring
- Track open/close events using
sqlite3_trace_v2
:void trace_callback(unsigned mask, void* ctx, void* p, void* x) { if (mask & SQLITE_TRACE_STMT) { printf("Statement: %s\n", sqlite3_expanded_sql((sqlite3_stmt*)p)); } } sqlite3_trace_v2(handle, SQLITE_TRACE_STMT, trace_callback, nullptr);
- Track open/close events using
File Descriptor Leak Detection
- Use
lsof
on rooted Android devices to verify open handles to the database and WAL files:adb shell lsof /data/data/com.package.my/files/main.db
- Use
By systematically addressing library unification, connection lifecycle management, and Android-specific file system behaviors, developers can stabilize WAL file persistence while maintaining the performance benefits of WAL mode. The critical insight is that SQLite’s internal state tracking is per-library-instance, necessitating rigorous control over dependency management in multi-component applications.