Segmentation Fault in FTS5 GLOB Query Due to Null Pointer Dereference

FTS5 Virtual Table GLOB Query Triggers Null Pointer Dereference in sqlite3Fts5ExprAnd

Issue Context: Faulty Expression Tree Construction During FTS5 GLOB Evaluation

The segmentation fault occurs while executing an FTS5 virtual table query that applies the GLOB operator to an integer literal. The crash manifests in sqlite3Fts5ExprAnd, a function responsible for combining FTS5 expression nodes during query evaluation. The root cause is invalid expression tree construction when the GLOB operator is given a non-text right-hand operand.

Key technical relationships:

  1. FTS5 Virtual Table Engine: Handles text search operations through specialized tokenizers (trigram in this case).
  2. GLOB Operator: Performs pattern matching and expects a text pattern as its right-hand operand.
  3. Expression Tree Construction: FTS5 builds an internal representation of query logic through nodes like Fts5ExprNode.
  4. sqlite3Fts5ExprAnd: Combines child expression nodes using logical AND semantics.

The crash occurs because the query parser does not validate the operand type for GLOB before building the expression tree. When passed an integer literal (0) instead of a string, FTS5 creates an invalid node structure, and a null pointer is dereferenced during evaluation. The trigram tokenizer matters here because it is what allows FTS5 to handle GLOB patterns through the full-text index, so the pattern reaches the FTS5 expression code in the first place.

Fault Propagation: Type Validation Gaps in FTS5 Query Parsing

Three primary factors contribute to this segmentation fault:

  1. Implicit Type Conversion Mismatch

    • Problem: SQLite automatically converts the integer 0 to the string "0" for GLOB, but the FTS5 internals bypass this conversion
    • Effect: The FTS5 expression parser receives a raw integer value instead of the expected string pattern
    • Code Path: fts5ParseGLOB (internal) fails to handle non-TEXT operands
  2. Uninitialized Expression Node Members

    • Location: sqlite3Fts5ExprAnd line 228253 (as per ASAN report)
    • Fault: Attempts to access pLeft->pNear when pLeft is null
    • Root Cause: fts5ExprParseTerm returns invalid node structure for non-string GLOB RHS
  3. Trigram Tokenizer Interaction

    • Tokenizer Hook: fts5TriTokenize modifies pattern handling
    • Side Effect: Empty token list generated for numeric GLOB patterns
    • Consequence: Triggers edge case in sqlite3Fts5ExprAnd null handling
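
The "empty token list" effect in item 3 has a simple arithmetic reading: a trigram tokenizer emits one token per three-character window, so any pattern shorter than three characters, such as "0", yields no tokens at all. The sketch below models only that arithmetic. count_trigrams is a hypothetical helper, not SQLite code, and it assumes ASCII input, whereas a real tokenizer works on Unicode characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of SQLite: counts the three-character
** windows in a byte string, which is how a trigram tokenizer slices
** its input. Assumes ASCII, so one byte equals one character. */
static size_t count_trigrams(const char *zText){
  size_t n;
  assert( zText!=NULL );
  n = strlen(zText);
  /* Fewer than three characters means there is no complete window. */
  return n<3 ? 0 : n-2;
}
```

Under this model, count_trigrams("0") is 0 while count_trigrams("abcd") is 2, which is consistent with a one-character pattern leaving the expression builder with nothing to attach.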

The fault chain progresses as:

GLOB operand type mismatch 
→ Invalid FTS5 expression node creation 
→ Null child node in AND expression 
→ Null pointer dereference during evaluation

Resolution Framework: Type Enforcement and Expression Tree Validation

Step 1: Validate GLOB Operand Types at Parse Time

Modify the FTS5 query parser to enforce the TEXT type for GLOB right-hand operands:

// In fts5_expr.c (fts5ParseGLOB):
// Reject any pattern value whose datatype is not TEXT.
if( sqlite3_value_type(pVal)!=SQLITE_TEXT ){
  sqlite3ErrorMsg(pParse, "GLOB requires string operand");
  return SQLITE_ERROR;
}
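
The same check can be exercised outside SQLite with a small model. This is a sketch under stated assumptions: ValType, PatternValue, and pattern_is_usable are hypothetical stand-ins for SQLite's value type codes and are not taken from the FTS5 source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SQLite's fundamental datatypes. */
typedef enum { VAL_NULL, VAL_INTEGER, VAL_FLOAT, VAL_TEXT, VAL_BLOB } ValType;

/* Hypothetical pattern value: a type tag plus an optional text payload. */
typedef struct {
  ValType eType;
  const char *zText;   /* meaningful only when eType==VAL_TEXT */
} PatternValue;

/* Returns 1 if the value may be used as a GLOB pattern, otherwise 0.
** Only TEXT values with a non-NULL payload are accepted. */
static int pattern_is_usable(const PatternValue *p){
  if( p==NULL ) return 0;
  assert( p->eType<=VAL_BLOB );      /* sanity check on the type tag */
  if( p->eType!=VAL_TEXT ) return 0;
  return p->zText!=NULL;
}
```

A caller would run this gate before any expression node is built, so an integer 0 is rejected up front rather than turning into a malformed tree.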

Step 2: Add Null Checks in Expression Combination Logic

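Harden sqlite3Fts5ExprAnd against missing child nodes:

// Revised logic in sqlite3Fts5ExprAnd:
// If either side is missing, release the side that was built
// and report an error instead of building an AND node with a null child.
if( pLeft == 0 || pRight == 0 ){
  if( pLeft ) sqlite3Fts5ExprNodeFree(pLeft);
  if( pRight ) sqlite3Fts5ExprNodeFree(pRight);
  return SQLITE_ERROR;
}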
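
To make the ownership rules of this guard concrete, here is a self-contained model of it. Node, node_new_leaf, node_free, and node_and are hypothetical names invented for illustration; the real FTS5 structures carry far more state. An alternative policy, not shown, would be to return the non-null side unchanged rather than reporting an error.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an FTS5 expression node. */
typedef struct Node Node;
struct Node {
  int isAnd;        /* 1 for an AND node, 0 for a leaf */
  Node *pLeft;
  Node *pRight;
};

/* Allocate a zeroed leaf node; returns NULL on allocation failure. */
static Node *node_new_leaf(void){
  return calloc(1, sizeof(Node));
}

/* Recursively free a subtree. Accepts NULL. */
static void node_free(Node *p){
  if( p==NULL ) return;
  node_free(p->pLeft);
  node_free(p->pRight);
  free(p);
}

/* Combine two subtrees with AND. Takes ownership of both inputs.
** Returns 0 and sets *ppOut on success. Returns 1 and sets *ppOut to
** NULL if either child is missing or allocation fails; in that case
** whichever inputs exist are freed. */
static int node_and(Node *pLeft, Node *pRight, Node **ppOut){
  Node *pNew;
  assert( ppOut!=NULL );
  *ppOut = NULL;
  if( pLeft==NULL || pRight==NULL ){
    node_free(pLeft);
    node_free(pRight);
    return 1;
  }
  pNew = calloc(1, sizeof(Node));
  if( pNew==NULL ){
    node_free(pLeft);
    node_free(pRight);
    return 1;
  }
  pNew->isAnd = 1;
  pNew->pLeft = pLeft;
  pNew->pRight = pRight;
  *ppOut = pNew;
  return 0;
}
```

The point of the model is that the error path never leaves a half-built tree behind: the caller either receives a complete AND node or nothing.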

Step 3: Handle Empty Token Lists in Trigram Tokenizer

Modify trigram tokenization to return an explicit error for missing or empty input:

// In fts5_trigram.c (fts5TriTokenize):
// Refuse to tokenize a missing or empty buffer.
if( pText==0 || nText<=0 ){
  return SQLITE_ERROR;
}
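
A stricter variant of this guard would also reject input too short to form a single trigram. The sketch below is an assumption-laden model rather than a patch: tokenize_guard and its return codes are hypothetical, and it counts bytes, so it is only accurate for ASCII input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for the model below. */
enum { GUARD_OK = 0, GUARD_ERROR = 1 };

/* Hypothetical guard: accepts a buffer only if it can yield at least
** one three-byte window. Byte-based, so accurate for ASCII only. */
static int tokenize_guard(const char *pText, int nText){
  assert( nText>=0 || pText==NULL || nText<0 );  /* any int is tolerated */
  if( pText==NULL || nText<=0 ) return GUARD_ERROR;  /* missing or empty */
  if( nText<3 ) return GUARD_ERROR;                  /* no full trigram */
  return GUARD_OK;
}
```

Whether short patterns should be an error or should fall back to a non-indexed scan is a design decision for the maintainers; the model only shows where such a check would sit.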

Step 4: Update FTS5 Query Documentation

State explicitly in the SQLite documentation that:

  • The FTS5 GLOB operator requires an explicit string literal as its pattern
  • Implicit type conversion is not supported in FTS5 queries
  • CAST(... AS TEXT) should be used when the pattern comes from a non-text value

Step 5: Add Regression Test Case

Add a test that verifies GLOB operand type handling:

CREATE VIRTUAL TABLE t1 USING fts5(x, tokenize='trigram');
-- Should throw error instead of crash
SELECT * FROM t1 WHERE x GLOB 0;

Step 6: Recompile with Additional Sanitizers

Build with sanitizers that catch invalid memory accesses such as this null pointer dereference:

export CFLAGS="$CFLAGS -g -fsanitize=address,undefined"

Step 7: Debugging Workflow for Similar Issues

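  1. Run the same query with WHERE c0 GLOB CAST(0 AS TEXT) to confirm that the crash depends on the operand type
  2. Examine the Fts5ExprNode structures in a debugger
  3. Trace bytecode execution via PRAGMA vdbe_trace=1 (available only in builds compiled with SQLITE_DEBUG)
  4. Inspect the prepared statement's bytecode using EXPLAIN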

Final Code Patch Guidance

For SQLite maintainers: The essential fix requires modifying fts5_expr.c to validate operand types before building expression nodes and hardening node combination logic against null children. This prevents invalid tree structures from reaching the evaluation phase.

For application developers: Immediately replace numeric literals in FTS5 GLOB clauses with explicit string literals or parameter bindings. Example remediation:

-- Vulnerable query
SELECT 0 FROM t0(0) WHERE c0 GLOB 0;

-- Corrected version
SELECT 0 FROM t0('0') WHERE c0 GLOB '0';
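
When the pattern originates as a number in application code, it can be rendered to text before it is bound. The helper below is a minimal sketch; int_to_pattern is a hypothetical name, and the resulting buffer would then be passed to sqlite3_bind_text rather than binding the integer directly.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the decimal text of v into zBuf.
** Returns the number of characters written (excluding the terminator),
** or -1 if the buffer is too small or formatting fails. */
static int int_to_pattern(long long v, char *zBuf, size_t nBuf){
  int n;
  assert( zBuf!=NULL && nBuf>0 );
  n = snprintf(zBuf, nBuf, "%lld", v);
  if( n<0 || (size_t)n>=nBuf ) return -1;
  return n;
}
```

Binding the text form keeps the operand type explicit, so the query does not depend on how FTS5 treats non-text values.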

Together, these changes address both the immediate null pointer dereference and the underlying type validation gap in FTS5 query processing.
