Unexpected NULL in Changeset When Updating Column with DEFAULT After ALTER TABLE
Understanding the Behavior of DEFAULT Values and Changeset Generation
The core issue revolves around how SQLite handles column additions with DEFAULT
values via ALTER TABLE
and how changesets generated afterward represent the "old" values of modified columns. Specifically, when a column with a DEFAULT
clause is added to a table, existing rows do not immediately receive the default value in their stored data. Instead, SQLite dynamically applies the default at query time unless the column is explicitly written to. This behavior leads to discrepancies when generating changesets (e.g., via the sqlite3session
extension or manual logging), where the "old" value of the column in an UPDATE
operation appears as NULL
instead of the expected default value. This guide will dissect the problem, explore its causes, and provide actionable solutions.
Root Cause: Stored NULL vs. Dynamic DEFAULT Application
1. How SQLite Stores Columns Added via ALTER TABLE
When a new column is added to a table using ALTER TABLE ADD COLUMN
, SQLite modifies the table schema but does not rewrite existing rows to populate the new column. Instead, it leaves the new column’s value as NULL
for all pre-existing rows. This is a performance optimization to avoid costly table rewrites. However, if the column has a DEFAULT
clause, SQLite’s query engine dynamically substitutes NULL
values with the default when the column is read. This substitution is virtual—it does not alter the stored data.
For example:
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO items VALUES (1, 'Item A');
ALTER TABLE items ADD COLUMN price INTEGER DEFAULT 0;
- The
price
column for the existing rowid=1
is stored asNULL
. - A
SELECT price FROM items
returns0
because the default is applied at read time.
2. Changeset Generation and the "Old" Value
Changesets (used for tracking data modifications) record the state of a row before an UPDATE
or DELETE
operation. When generating a changeset for an UPDATE
like:
UPDATE items SET price = 10 WHERE id = 1;
the "old" value of price
is derived from the stored data, not the dynamically computed default. Since the stored value is NULL
, the changeset reflects NULL
as the old value, even though queries return 0
.
3. The Expectation Gap
Users expect the changeset to reflect the logical state of the data (i.e., the default 0
), not the physical storage (NULL
). This mismatch arises because SQLite’s changeset mechanism does not account for column defaults when capturing pre-modification values. The behavior is consistent with SQLite’s documentation but counterintuitive for those assuming that defaults are materialized.
Key Factors Contributing to the Issue
1. Delayed Materialization of DEFAULT Values
SQLite defers the application of DEFAULT
values for columns added via ALTER TABLE
. The default is only materialized when:
- A new row is inserted after the
ALTER TABLE
. - An existing row is explicitly updated to set the column’s value.
Until then, the column’s value remains NULL
in storage, with the default applied transiently during queries. Changeset generation tools like sqlite3session
access the stored NULL
directly, bypassing the default logic.
2. Changeset Design Philosophy
SQLite’s changeset mechanism prioritizes accuracy to on-disk state over user-facing abstractions. It records the literal bytes stored in the database file, ignoring computed values like defaults or generated columns. This ensures consistency with rollback journals and Write-Ahead Logging (WAL) but creates confusion when defaults are involved.
3. Ambiguity in DEFAULT Handling Across Operations
The behavior differs depending on how a column is added:
- CREATE TABLE: Defaults are materialized immediately for all rows.
- ALTER TABLE: Defaults are virtual until explicitly written.
This inconsistency is not well-documented in the context of changesets, leading to unexpected outcomes.
Resolving the Discrepancy: Workarounds and Fixes
1. Forcing Materialization of DEFAULT Values
To ensure the default value is stored (and thus appears correctly in changesets), explicitly update existing rows after adding the column:
ALTER TABLE items ADD COLUMN price INTEGER DEFAULT 0;
UPDATE items SET price = DEFAULT;
This writes the default value (0
) into the price
column for all rows, replacing the stored NULL
. Subsequent changesets will show 0
as the "old" value.
Trade-offs:
- Pros: Simple, ensures consistency between queries and changesets.
- Cons: Requires a full table scan, which may be expensive for large tables.
2. Modifying Changeset Generation Logic (Advanced)
If you control the changeset generation process (e.g., using triggers or application code), intercept the "old" value and substitute NULL
with the column’s default. This requires querying the schema to retrieve defaults:
-- Retrieve the default value for the 'price' column
SELECT dflt_value FROM pragma_table_info('items') WHERE name = 'price';
Use this value to replace NULL
in the changeset’s "old" data.
Trade-offs:
- Pros: Avoids modifying stored data.
- Cons: Increases complexity; may not integrate well with built-in tools like
sqlite3session
.
3. Applying the Official SQLite Patch
The SQLite team addressed this issue in commit 6a8c687904e92f00. The patch updates the changeset generator to consider column defaults when recording "old" values. To leverage this fix:
- Download a patched SQLite build from the official repository.
- Recompile your application against the updated library.
Verification Steps:
- Reproduce the original scenario with the patched build.
- Generate a changeset after the
UPDATE
operation. - Confirm the "old" value for
price
is0
instead ofNULL
.
Caveats:
- Ensure compatibility with existing databases and tools.
- Test for regressions in other changeset-related operations.
4. Schema Design Best Practices
Avoid relying on ALTER TABLE ADD COLUMN
with DEFAULT
for critical workflows involving changesets. Instead:
- Predefine columns: Include all likely columns at table creation, even if unused initially.
- Use views or triggers: Create a view that materializes defaults, and perform updates through the view.
Example:
CREATE VIEW items_view AS
SELECT id, name, COALESCE(price, 0) AS price FROM items;
-- Use INSTEAD OF triggers to handle updates
CREATE TRIGGER items_view_update INSTEAD OF UPDATE ON items_view
BEGIN
UPDATE items SET price = NEW.price WHERE id = OLD.id;
END;
Trade-offs:
- Pros: Decouples storage from application logic.
- Cons: Adds maintenance overhead for views and triggers.
By understanding the interplay between SQLite’s storage model, default values, and changeset generation, developers can choose the most appropriate strategy for their use case—whether that’s modifying data storage, adjusting application logic, or applying upstream fixes.