Unexpected NULL in Changeset When Updating Column with DEFAULT After ALTER TABLE


Understanding the Behavior of DEFAULT Values and Changeset Generation

The core issue revolves around how SQLite handles column additions with DEFAULT values via ALTER TABLE and how changesets generated afterward represent the "old" values of modified columns. Specifically, when a column with a DEFAULT clause is added to a table, existing rows do not immediately receive the default value in their stored data. Instead, SQLite dynamically applies the default at query time unless the column is explicitly written to. This behavior leads to discrepancies when generating changesets (e.g., via the sqlite3session extension or manual logging), where the "old" value of the column in an UPDATE operation appears as NULL instead of the expected default value. This guide will dissect the problem, explore its causes, and provide actionable solutions.


Root Cause: Stored NULL vs. Dynamic DEFAULT Application

1. How SQLite Stores Columns Added via ALTER TABLE

When a new column is added to a table using ALTER TABLE ADD COLUMN, SQLite modifies the table schema but does not rewrite existing rows to populate the new column. Instead, it leaves the new column’s value as NULL for all pre-existing rows. This is a performance optimization to avoid costly table rewrites. However, if the column has a DEFAULT clause, SQLite’s query engine dynamically substitutes NULL values with the default when the column is read. This substitution is virtual—it does not alter the stored data.

For example:

CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO items VALUES (1, 'Item A');
ALTER TABLE items ADD COLUMN price INTEGER DEFAULT 0;
  • The price column for the existing row id=1 is stored as NULL.
  • A SELECT price FROM items returns 0 because the default is applied at read time.

2. Changeset Generation and the "Old" Value

Changesets (used for tracking data modifications) record the state of a row before an UPDATE or DELETE operation. When generating a changeset for an UPDATE like:

UPDATE items SET price = 10 WHERE id = 1;

the "old" value of price is derived from the stored data, not the dynamically computed default. Since the stored value is NULL, the changeset reflects NULL as the old value, even though queries return 0.

3. The Expectation Gap

Users expect the changeset to reflect the logical state of the data (i.e., the default 0), not the physical storage (NULL). This mismatch arises because SQLite’s changeset mechanism does not account for column defaults when capturing pre-modification values. The behavior is consistent with SQLite’s documentation but counterintuitive for those assuming that defaults are materialized.


Key Factors Contributing to the Issue

1. Delayed Materialization of DEFAULT Values

SQLite defers the application of DEFAULT values for columns added via ALTER TABLE. The default is only materialized when:

  • A new row is inserted after the ALTER TABLE.
  • An existing row is explicitly updated to set the column’s value.

Until then, the column’s value remains NULL in storage, with the default applied transiently during queries. Changeset generation tools like sqlite3session access the stored NULL directly, bypassing the default logic.

2. Changeset Design Philosophy

SQLite’s changeset mechanism prioritizes accuracy to on-disk state over user-facing abstractions. It records the literal bytes stored in the database file, ignoring computed values like defaults or generated columns. This ensures consistency with rollback journals and Write-Ahead Logging (WAL) but creates confusion when defaults are involved.

3. Ambiguity in DEFAULT Handling Across Operations

The behavior differs depending on how a column is added:

  • CREATE TABLE: Defaults are materialized immediately for all rows.
  • ALTER TABLE: Defaults are virtual until explicitly written.

This inconsistency is not well-documented in the context of changesets, leading to unexpected outcomes.


Resolving the Discrepancy: Workarounds and Fixes

1. Forcing Materialization of DEFAULT Values

To ensure the default value is stored (and thus appears correctly in changesets), explicitly update existing rows after adding the column:

ALTER TABLE items ADD COLUMN price INTEGER DEFAULT 0;
UPDATE items SET price = DEFAULT;

This writes the default value (0) into the price column for all rows, replacing the stored NULL. Subsequent changesets will show 0 as the "old" value.

Trade-offs:

  • Pros: Simple, ensures consistency between queries and changesets.
  • Cons: Requires a full table scan, which may be expensive for large tables.

2. Modifying Changeset Generation Logic (Advanced)

If you control the changeset generation process (e.g., using triggers or application code), intercept the "old" value and substitute NULL with the column’s default. This requires querying the schema to retrieve defaults:

-- Retrieve the default value for the 'price' column
SELECT dflt_value FROM pragma_table_info('items') WHERE name = 'price';

Use this value to replace NULL in the changeset’s "old" data.

Trade-offs:

  • Pros: Avoids modifying stored data.
  • Cons: Increases complexity; may not integrate well with built-in tools like sqlite3session.

3. Applying the Official SQLite Patch

The SQLite team addressed this issue in commit 6a8c687904e92f00. The patch updates the changeset generator to consider column defaults when recording "old" values. To leverage this fix:

  1. Download a patched SQLite build from the official repository.
  2. Recompile your application against the updated library.

Verification Steps:

  1. Reproduce the original scenario with the patched build.
  2. Generate a changeset after the UPDATE operation.
  3. Confirm the "old" value for price is 0 instead of NULL.

Caveats:

  • Ensure compatibility with existing databases and tools.
  • Test for regressions in other changeset-related operations.

4. Schema Design Best Practices

Avoid relying on ALTER TABLE ADD COLUMN with DEFAULT for critical workflows involving changesets. Instead:

  • Predefine columns: Include all likely columns at table creation, even if unused initially.
  • Use views or triggers: Create a view that materializes defaults, and perform updates through the view.
    Example:
CREATE VIEW items_view AS
SELECT id, name, COALESCE(price, 0) AS price FROM items;

-- Use INSTEAD OF triggers to handle updates
CREATE TRIGGER items_view_update INSTEAD OF UPDATE ON items_view
BEGIN
  UPDATE items SET price = NEW.price WHERE id = OLD.id;
END;

Trade-offs:

  • Pros: Decouples storage from application logic.
  • Cons: Adds maintenance overhead for views and triggers.

By understanding the interplay between SQLite’s storage model, default values, and changeset generation, developers can choose the most appropriate strategy for their use case—whether that’s modifying data storage, adjusting application logic, or applying upstream fixes.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *