Stale Column Metadata in SQLite WASM OO1 API exec() After Schema Modifications
Mechanism of Stale Column Metadata Caching in OO1 API exec()
The SQLite WASM OO1 API’s exec()
method is designed to execute SQL statements and process results efficiently. A critical but subtle flaw exists in its handling of prepared statements when the database schema changes after statement preparation but before execution. The exec()
method caches column metadata—such as column names and counts—before executing the statement via sqlite3_step()
. This metadata is stored in properties like columnNames
and columnCount
of the Stmt
object. However, SQLite’s sqlite3_step()
function may automatically recompile a prepared statement if the database schema changes during execution. When this recompilation occurs, the schema-dependent metadata cached by exec()
becomes stale, leading to inconsistencies between the actual query results and the reported column information.
For example, consider a web application that dynamically replaces a table named data
with a new version containing different columns. A SELECT * FROM data
query executed via exec()
after the schema change might return rows with the new columns but report the original column names cached during the initial preparation phase. This discrepancy arises because the column metadata was captured before the first call to sqlite3_step()
, which triggers recompilation. The cached metadata is not refreshed after recompilation, causing the application to process data with incorrect schema information. This behavior is particularly problematic in environments with multiple database connections or where schema changes are frequent, such as interactive data exploration tools or multi-user web applications.
The core issue stems from the timing of metadata extraction relative to statement execution. The OO1 API’s exec()
method prepares a statement, immediately fetches column metadata, and then begins stepping through the results. If the schema changes between preparation and stepping (e.g., due to actions on another connection), the recompiled statement’s metadata diverges from the cached values. This violates the expectation that columnNames
and columnCount
reflect the actual structure of the result set.
Schema-Dependent Metadata Lifecycle and Recompilation Triggers
1. Premature Metadata Caching During Statement Preparation
The OO1 API’s exec()
method initializes a Stmt
object by preparing a SQL statement and immediately querying column metadata via sqlite3_column_name()
and sqlite3_column_count()
. These values are stored in the Stmt.columnNames
array and Stmt.columnCount
property. However, at this stage, the statement has not yet been executed (sqlite3_step()
), and the schema could change before or during execution. For instance:
// Simplified OO1 API exec() logic
const stmt = db.prepare(sql); // Prepares statement and caches column names/count
stmt.columnNames; // ["old_col1", "old_col2"]
db.step(); // May recompile the statement if schema changed
// stmt.columnNames remains ["old_col1", "old_col2"] even if schema changed
2. Automatic Statement Recompilation During sqlite3_step()
SQLite’s sqlite3_step()
function automatically recompiles a prepared statement if the schema changes after preparation but before execution. This is documented in SQLite’s C API: when a schema change occurs, instead of returning SQLITE_SCHEMA
, sqlite3_step()
transparently recompiles the statement and continues execution. While this ensures forward progress, it invalidates any schema-dependent metadata cached prior to recompilation. For example:
- Connection A prepares
SELECT * FROM tbl
(columns:id, name
). - Connection B alters
tbl
to add a column (email
). - Connection A calls
sqlite3_step()
, triggering recompilation. The result set now includesemail
, butcolumnNames
still reflects["id", "name"]
.
3. Schema Volatility in Multi-Connection or Dynamic Environments
Applications using multiple database connections (e.g., browser tabs with separate connections) or dynamic schema modifications (e.g., replacing tables via CREATE TABLE ... AS SELECT
) are particularly susceptible. The exec()
method’s cached metadata becomes stale if any connection modifies the schema after the target statement is prepared but before it is stepped. This is exacerbated by the use of SELECT *
, which implicitly depends on the current schema at execution time.
4. Metadata Access Before Result Processing
The exec()
method’s columnNames
option populates an array with column names before invoking the row callback. This design assumes the schema remains stable throughout execution. When the schema changes mid-execution, the pre-populated columnNames
array no longer matches the actual columns of the rows passed to the callback. Similarly, the Stmt
object’s cached columnCount
becomes incorrect if the recompiled statement has a different number of columns.
Correcting Metadata Timing and Mitigating Schema Volatility
1. Defer Column Metadata Extraction Until After First Step
Modify the exec()
method to delay fetching column names and counts until after the first successful call to sqlite3_step()
. This ensures that any schema changes triggering recompilation are accounted for before metadata is cached. The revised logic would:
- Prepare the statement.
- Step to execute it (potentially triggering recompilation).
- Fetch column names/count from the recompiled statement.
- Populate
columnNames
and process rows.
Implementation Adjustment:
// Revised exec() logic
const stmt = db.prepare(sql);
let rc = db.step();
if (rc === SQLITE_ROW) {
// Fetch metadata AFTER stepping
const columnNames = [];
for (let i = 0; i < stmt.columnCount; i++) {
columnNames.push(stmt.columnName(i));
}
// Populate columnNames array and invoke callback
}
2. Dynamic Property Access for Schema-Dependent Values
Replace cached properties like Stmt.columnCount
with dynamic accessors that query the underlying statement state on each access. For example, instead of storing columnCount
during preparation, define it as a getter that calls sqlite3_column_count()
:
Object.defineProperty(Stmt.prototype, 'columnCount', {
get() {
return this.ptr ? sqlite3_column_count(this.ptr) : 0;
}
});
This ensures that columnCount
always reflects the current state of the statement, even after recompilation.
3. Handling Statements With No Result Rows
Statements that return no rows (e.g., INSERT
) pose a challenge because sqlite3_step()
returns SQLITE_DONE
immediately, skipping the row-processing phase. To capture metadata in such cases:
- After stepping, check if the statement has result columns via
sqlite3_column_count()
. - If
columnCount > 0
, fetch column names even if no rows are returned.
const stmt = db.prepare("SELECT * FROM non_existent_table");
const rc = db.step();
if (rc === SQLITE_DONE && stmt.columnCount > 0) {
// Schema changed after preparation; fetch fresh column names
const columnNames = [...];
}
4. Transaction Boundaries to Stabilize Schema State
Encourage developers to wrap schema modifications and subsequent queries in transactions. While SQLite’s DEFERRED
transactions do not lock the database, an explicit BEGIN IMMEDIATE
transaction prevents other connections from modifying the schema during execution:
db.exec("BEGIN IMMEDIATE");
// Modify schema and execute queries
db.exec("COMMIT");
This reduces the window during which schema changes can invalidate prepared statements.
5. Avoiding Implicit Schema Dependencies
Discourage the use of SELECT *
in favor of explicit column lists. For example:
-- Instead of:
SELECT * FROM tbl;
-- Use:
SELECT id, name, email FROM tbl;
Explicit column lists decouple the query from schema changes, as the statement will fail with SQLITE_ERROR
if a specified column is removed, rather than silently returning stale metadata.
6. Patch Implementation and Versioning
The SQLite team addressed this issue in the commit oo1-no-cache-Stmt.columnCount
by:
- Removing cached
columnCount
in favor of a dynamic property. - Adjusting
DB.exec()
to fetch column names after the first step.
Developers should upgrade to SQLite versions including this patch (post-3.42) or backport the changes if using a custom build. Verify fixes by testing schema changes mid-query:
// Test case
const db = new sqlite3.oo1.DB();
db.exec("CREATE TABLE t1(a, b); INSERT INTO t1 VALUES(1, 2);");
const results = db.exec("SELECT * FROM t1"); // columnNames = ["a", "b"]
db.exec("ALTER TABLE t1 RENAME COLUMN a TO c;");
const results2 = db.exec("SELECT * FROM t1"); // Post-patch: columnNames = ["c", "b"]