Data Corruption in SQLite After ALTER TABLE DROP COLUMN: Causes and Fixes
Understanding Data Misalignment During Column Removal in SQLite
Issue Overview: Column Drop Operation Causes Incorrect Data Values in Remaining Column
The core issue revolves around unexpected data corruption occurring when using the ALTER TABLE DROP COLUMN
command in SQLite versions prior to 3.35.5 and 3.36.0. The problem manifests when a column is dropped from a table containing an INTEGER PRIMARY KEY
column, resulting in values from the dropped column erroneously appearing in the remaining column for a subset of rows. This behavior is non-deterministic and depends on factors such as the size of stored data, the number of rows, and internal page-level optimizations.
Key Observations from the Problem Scenario
Reproductive Test Case:
- A table
MyTable
is created with three columns:ID
(INTEGER PRIMARY KEY),Foo
, andBar
. - 50,000 identical rows are inserted, where
Foo=123
andBar=456
. - After dropping the
Foo
column, a subset of rows (e.g., 17,226 out of 50,000) showBar=123
instead of the expected456
.
- A table
Data Size Dependency:
- When the dropped column (
Foo
) and retained column (Bar
) store small integers (e.g., 1 and 2), corruption occurs. - Larger integers (e.g., 123456789 and 987654321) do not trigger the issue.
- This suggests a link between data representation size and SQLite’s internal storage mechanisms.
- When the dropped column (
Row Count Threshold:
- Even with smaller datasets (e.g., 500 rows), corruption occurs beyond a specific row index (e.g., row 135 in a 500-row table).
- The corruption pattern is consistent: values from the dropped column "spill over" into the retained column starting at a specific row.
Impact on Schema Integrity:
- The corruption violates relational integrity, as the retained column’s values no longer reflect the original data.
- The issue is not detectable through schema validation tools, as the table structure itself is valid post-operation.
Underlying Mechanism of the Bug
The corruption arises from SQLite’s method of handling ALTER TABLE DROP COLUMN
operations. Internally, SQLite creates a new table with the revised schema, copies data from the old table to the new one, and then drops the old table. However, optimizations for tables with INTEGER PRIMARY KEY
columns (which alias the rowid) lead to incomplete data copying when columns are dropped. Specifically:
- SQLite avoids rewriting the rowid, assuming its stability.
- When columns are dropped, the remaining columns’ offsets in stored records shift. If the old and new schemas have different numbers of columns, this offset miscalculation causes misaligned data reads.
- Small integers are stored inline within database pages, exacerbating alignment errors due to fixed-size encoding. Larger integers may use variable-length storage or overflow pages, avoiding the issue.
Root Causes: Storage Engine Optimizations and Schema Migration Edge Cases
The corruption is rooted in SQLite’s storage engine and schema migration logic. Below are the technical factors contributing to the problem:
Rowid Optimization for INTEGER PRIMARY KEY Tables
- Tables with an
INTEGER PRIMARY KEY
column alias the rowid, a 64-bit signed integer key used for internal storage. - During
ALTER TABLE
operations, SQLite optimizes data migration by preserving the rowid, assuming its immutability. - Failure Point: This optimization skips re-encoding the entire row, leading to incorrect column offset calculations when columns are dropped.
- Tables with an
Record Encoding and Column Offsets
- SQLite stores table rows as binary records, with each column’s value encoded sequentially.
- When a column is dropped, the retained columns’ positions in the record change. For example, dropping the second column (
Foo
) shifts the third column (Bar
) to the second position. - Failure Point: The migration logic fails to adjust column offsets correctly for existing records, causing the retained column (
Bar
) to read data from the dropped column’s (Foo
) storage location.
Data Type and Storage Size Sensitivity
- Small integers (e.g., 1, 123) are stored as 1-byte or 4-byte inline values.
- Larger integers (e.g., 123456789) may exceed inline storage thresholds, triggering variable-length encoding or overflow pages.
- Failure Point: Inline storage of small integers creates fixed-size records, amplifying offset misalignment. Variable-length encoding introduces padding or length prefixes that mitigate alignment errors.
Page-Level Data Organization
- SQLite organizes data into fixed-size pages (default 4KB). Records are packed into pages, with overflow chains for large values.
- Failure Point: When dropping a column, SQLite’s page-level reorganization may incorrectly map old records to the new schema, especially when the old and new records have different sizes.
Lack of Full Record Rewrite
- The
ALTER TABLE
command avoids rewriting entire records for efficiency. Instead, it relies on column offset adjustments. - Failure Point: This optimization assumes column positions are stable, which fails when columns are removed.
- The
Resolving the Issue: Upgrades, Workarounds, and Mitigation Strategies
Step 1: Upgrade to a Patched SQLite Version
The bug was resolved in SQLite 3.35.5 (backport) and 3.36.0 (official release). Verify your SQLite version using:
SELECT sqlite_version();
If the version is older than 3.35.5, upgrade using one of the following methods:
- Official Precompiled Binaries: Download from SQLite Downloads.
- Recompile from Source: Integrate the fix from check-in 354a4db5cb769c6a.
Step 2: Manual Schema Recreation (For Unpatched Versions)
If upgrading is not feasible, manually recreate the table:
Create a New Table
CREATE TABLE MyTable_new ( ID INTEGER PRIMARY KEY NOT NULL, Bar INTEGER );
Copy Data with Explicit Column Mapping
INSERT INTO MyTable_new (ID, Bar) SELECT ID, Bar FROM MyTable;
Drop the Old Table and Rename the New One
DROP TABLE MyTable; ALTER TABLE MyTable_new RENAME TO MyTable;
Recreate Indexes and Triggers
Use.schema
in the SQLite shell to dump existing indexes/triggers, adjust them for the new schema, and reapply.
Step 3: Data Validation Post-Migration
After dropping a column or recreating a table, validate data integrity:
-- Check for unexpected values in the retained column
SELECT COUNT(*) FROM MyTable WHERE Bar NOT IN (456, ...);
Step 4: Avoid INTEGER PRIMARY KEY When Possible
If your table does not require an INTEGER PRIMARY KEY
, use a WITHOUT ROWID
table or a composite primary key to bypass rowid-related optimizations:
CREATE TABLE MyTable (
ID INTEGER NOT NULL,
Foo INTEGER,
Bar INTEGER,
PRIMARY KEY (ID)
) WITHOUT ROWID;
Step 5: Use Larger Data Values as Temporary Mitigation
If immediate fixes are unavailable, store values in the affected columns as larger integers or strings to force variable-length encoding:
-- Add 1000000000 to small integers to increase their storage size
UPDATE MyTable SET Foo = Foo + 1000000000, Bar = Bar + 1000000000;
Step 6: Enable Defensive Database Settings
Configure SQLite to use stricter data integrity checks:
PRAGMA integrity_check;
PRAGMA quick_check;
Run these periodically after schema changes.
Step 7: Leverage Backup and Restore Strategies
Before performing schema migrations, create a backup:
sqlite3 mydb.db ".backup mydb.backup"
If corruption occurs, restore from the backup and retry with patched SQLite.
Step 8: Monitor SQLite Release Notes
Subscribe to SQLite’s release announcements to stay informed about critical fixes.
By understanding the interplay between SQLite’s storage engine optimizations and schema migration logic, developers can preemptively avoid this class of bugs and ensure robust data integrity during schema changes.