Assertion Failure in sqlite3ExprAffinity Due to Collation and Index Optimization


Collision Between Indexed Collation Expressions and Affinity Assertion Checks

Root Cause: Expression Tree Validation During Affinity Determination

The assertion failure in sqlite3ExprAffinity arises when the SQLite query compiler processes an expression tree that violates assumptions about valid node types. This occurs specifically when:

  1. An index is defined with an explicit collation sequence (e.g., COLLATE NOCASE) applied to an expression (e.g., +c0 COLLATE NOCASE).
  2. The query optimizer attempts to reuse the indexed expression in a context where affinity derivation is required (e.g., GROUP BY clauses or aggregate functions).
  3. The internal representation of the indexed expression does not match the expected node types (TK_COLLATE, TK_IF_NULL_ROW, or a register variant) when affinity is computed.

The sqlite3ExprAffinity function enforces strict rules about the structure of expression nodes. When these rules are violated due to an incorrectly transformed expression tree, the assertion triggers a fatal error. The crux of the issue lies in the interaction between indexed expressions with collations and optimizer decisions that alter the expression tree during query planning.


Critical Factors Leading to Affinity Assertion Violations

1. Indexed Expressions with Collation Modifiers

SQLite allows collation sequences to be applied directly to expressions within index definitions. For example:

CREATE INDEX i ON v0 (+c0 COLLATE NOCASE);

Here, the unary + operator prevents the expression from being optimized away, forcing SQLite to treat c0 COLLATE NOCASE as an indexed expression. When the query planner later attempts to reuse this indexed expression in a different context (e.g., a SELECT list or GROUP BY clause), it may incorrectly propagate or omit collation information during expression tree transformations.

2. Aggressive Query Optimizer Transformations

The SQLite optimizer employs heuristic rewrites to reuse indexed expressions, reducing computational overhead. When the SQLITE_IndexedExpr optimization is enabled (default), the optimizer substitutes raw column references with their indexed expression counterparts where possible. However, collation modifiers complicate this substitution:

  • Collation information is stored in the pColl field of Expr nodes.
  • If the optimizer fails to propagate this field during substitution, the resulting expression node may lack collation metadata, violating internal invariants.

3. Grouping and Aggregation Contexts

The GROUP BY clause in the problematic query (GROUP BY SUBSTR(0,0)) introduces a grouping key derived from a constant (since SUBSTR(0,0) evaluates to an empty string). This forces SQLite to compute affinity for all expressions in the SELECT list relative to the grouping key. When the indexed expression +c0 COLLATE NOCASE is reused here, the absence of collation metadata during affinity calculation triggers the assertion failure.


Resolution: Validating and Correcting Expression Tree Transformations

Step 1: Disabling the IndexedExpr Optimization (Temporary Workaround)

Disabling the SQLITE_IndexedExpr optimization prevents the query planner from substituting indexed expressions into invalid contexts. Execute this before running the query:

.testctrl optimizations 0x01000000;  -- Disables IndexedExpr (bit 24)

This skips the problematic substitution, avoiding the assertion. However, this is a temporary fix and may degrade performance for other queries relying on this optimization.

Step 2: Patching Expression Tree Validation Logic

The underlying bug stems from incomplete collation handling during expression substitution. The fix involves:

  1. Collation Propagation: Ensuring collation metadata (pColl) is copied when substituting indexed expressions into new contexts.
  2. Node Type Sanity Checks: Verifying that expression nodes derived from indexed expressions retain valid types (TK_COLLATE, TK_IF_NULL_ROW) before affinity computation.

The commit cf6454ce26983b9c addresses this by refining collation propagation rules in the optimizer. Apply this patch or upgrade to a SQLite version containing it.

Step 3: Query Restructuring to Avoid Edge Cases

Rewrite the query to decouple collation modifiers from indexed expressions in aggregation contexts:

SELECT COUNT((SELECT 1 WHERE c0 COLLATE NOCASE = 1)) FROM v0 GROUP BY '';  

This eliminates the implicit reuse of the indexed expression +c0 COLLATE NOCASE in the GROUP BY context, sidestepping the assertion.


By addressing collation propagation in optimizer transformations and avoiding ambiguous expression reuse, the assertion failure is resolved. Implement these fixes preemptively in schemas leveraging expression indexes with collations.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *