Assertion Failure in recoverAddTable When Defining Duplicate Column in Primary Key With Collation Conflict
Issue Overview: Duplicate Column in WITHOUT ROWID Primary Key Triggers Assertion During Schema Recovery
The core issue arises when attempting to create a WITHOUT ROWID table in SQLite with a composite primary key that includes the same column twice but with conflicting collation sequences. The specific error manifests as an assertion failure in the recoverAddTable
function during schema recovery operations. This failure occurs because SQLite’s internal logic for rebuilding table schemas encounters an inconsistency between column indices and declared constraints when processing duplicate column names in primary keys with differing collations.
When a WITHOUT ROWID table is created, SQLite implicitly converts the primary key definition into a clustered index. This index must adhere to stricter constraints compared to ordinary indexes, including unique column identification. The assertion iField < pNew->nCol && iCol < pNew->nCol
in recoverAddTable
fails because the schema recovery logic miscomputes column indices when reconstructing a table definition where a single physical column appears multiple times in the primary key with different collations. This creates a mismatch between the number of declared columns in the index (derived from the primary key) and the actual columns available in the table schema.
The error is triggered specifically when SQLite’s writable_schema pragma is enabled, which allows direct modification of the sqlite_schema
table. This pragma bypasses normal schema validation checks, deferring integrity verification until the schema is reloaded. When combined with the sqlite3_recover module (used for rebuilding corrupted databases), the duplicate column entries in the primary key cause an array bounds violation during schema reconstruction. The root cause lies in how SQLite handles column identity in composite primary keys when collation differences create logical duplicates of the same physical column.
Possible Causes: Collation Conflicts and Column Identity Mismanagement in Composite Primary Keys
1. Collation-Induced Logical Duplicates in Primary Key Definitions
SQLite allows columns in indexes (including implicit primary key indexes for WITHOUT ROWID tables) to specify collation sequences. When a column appears multiple times in a composite primary key with different collations, SQLite treats each occurrence as a distinct logical entry in the index. However, the physical column count for the table remains unchanged. For example, the declaration PRIMARY KEY(a, a COLLATE NOCASE)
creates an index with two entries referencing the same physical column a
but with different collations. This creates an inconsistency: the index expects two distinct columns, but the table schema only has one. The recoverAddTable
function, which rebuilds table definitions during recovery, fails to reconcile this discrepancy, leading to an array index overflow.
2. Schema Recovery Logic Assumes Unique Column Indices
The sqlite3_recover
module processes the sqlite_schema
table to reconstruct database objects. When encountering a WITHOUT ROWID table, it parses the primary key definition to build an equivalent index. The recovery logic assumes that each column referenced in the index corresponds to a distinct physical column in the table. When a column is duplicated with a different collation, the recovery code erroneously increments column index counters (iField
and iCol
) beyond the actual number of columns in the table (pNew->nCol
). This violates the assertion that ensures array indices remain within bounds.
3. Writable_Schema Pragma Bypasses Early Validation
Normally, SQLite validates schema changes during parsing and execution. However, enabling PRAGMA writable_schema = on
disables these checks, allowing invalid schema entries to be written directly to the sqlite_schema
table. The invalid primary key definition (with duplicate column entries) is persisted without immediate validation. The error surfaces later during schema reloading or recovery operations, when SQLite reconstructs the schema from the sqlite_schema
table and encounters the malformed index definition.
Troubleshooting Steps, Solutions & Fixes: Resolving Collation Conflicts in Primary Keys and Schema Recovery
Step 1: Validate Primary Key Definitions for Column Uniqueness
Avoid using the same physical column multiple times in composite primary keys, especially with differing collations. Modify the table definition to ensure all primary key columns refer to distinct physical columns:
-- Invalid (triggers assertion):
CREATE TABLE t(a, PRIMARY KEY(a, a COLLATE NOCASE)) WITHOUT ROWID;
-- Valid alternative:
CREATE TABLE t(a, b COLLATE NOCASE, PRIMARY KEY(a, b)) WITHOUT ROWID;
If the use case requires indexing the same column with multiple collations, create separate indexes explicitly rather than embedding them in the primary key:
CREATE TABLE t(a) WITHOUT ROWID;
CREATE INDEX t_a_nocase ON t(a COLLATE NOCASE);
Step 2: Upgrade to SQLite 3.41.1 or Later
The assertion failure was patched in commit 8a1ad137606ce5bd. This fix ensures that the recovery logic properly handles duplicate column references in primary keys by validating column indices before accessing them. Verify your SQLite version:
sqlite3 --version
If the version predates 3.41.1 (released 2023-04-20), download the latest amalgamation from sqlite.org/download and recompile your application.
Step 3: Disable Writable_Schema During Schema Modifications
Avoid using PRAGMA writable_schema = on
unless absolutely necessary. If you must modify the schema directly, re-enable schema validation immediately afterward:
PRAGMA writable_schema = on;
-- Perform schema modifications --
PRAGMA writable_schema = off;
PRAGMA integrity_check;
The integrity_check
pragma forces SQLite to validate the schema and will flag invalid primary key definitions early, preventing latent errors during recovery.
Step 4: Audit Schema Recovery Workflows
If you use the sqlite3_recover
module or similar tools, ensure they are updated to handle edge cases involving collation conflicts. Test recovery procedures on a copy of the database before applying them to production data. For databases created with older SQLite versions, manually inspect WITHOUT ROWID
table definitions for duplicate primary key columns and refactor them as needed.
Step 5: Enable Debugging Symbols for Detailed Crash Analysis
Compile SQLite with debugging symbols to obtain detailed stack traces when assertions fail:
export CFLAGS="-g -DSQLITE_DEBUG"
./configure
make
Reproduce the issue and use a debugger (e.g., gdb
) to inspect variable states at the point of failure:
gdb --args ./sqlite3 test.db
(gdb) run < poc.sql
(gdb) backtrace
Examine the values of pNew->nCol
, iField
, and iCol
in the recoverAddTable
function to confirm the index mismatch.
Step 6: Refactor Legacy Schemas to Avoid Collation Duplicates
For existing databases with problematic primary keys, create a new table with a corrected schema and migrate data:
-- Original table (problematic):
CREATE TABLE t(a, PRIMARY KEY(a, a COLLATE NOCASE)) WITHOUT ROWID;
-- Replacement table:
CREATE TABLE t_new(a PRIMARY KEY, a_nocase COLLATE NOCASE) WITHOUT ROWID;
INSERT INTO t_new SELECT a, a FROM t;
DROP TABLE t;
ALTER TABLE t_new RENAME TO t;
This ensures the primary key references distinct columns while preserving the collation requirements.
Step 7: Implement Custom Collation Sequences for Complex Sorting
If the original intent was to sort the primary key with multiple collations, define a custom collation sequence that combines the desired behaviors:
sqlite3_create_collation(db, "MIXED_COLLATION", SQLITE_UTF8, NULL, mixedCollationFunc);
CREATE TABLE t(a, PRIMARY KEY(a COLLATE MIXED_COLLATION)) WITHOUT ROWID;
This avoids duplicating columns while achieving the required sorting logic.
Step 8: Monitor Schema Changes with SQLite’s PRAGMA Functions
Use PRAGMA schema_version
and PRAGMA foreign_key_check
after schema modifications to detect inconsistencies early. Automated testing frameworks should include schema validation steps to catch similar issues during development.