Segmentation Fault in FTS5 GLOB Query Due to Null Pointer Dereference
FTS5 Virtual Table GLOB Query Triggers Null Pointer Dereference in sqlite3Fts5ExprAnd
Issue Context: Faulty Expression Tree Construction During FTS5 GLOB Evaluation
The segmentation fault occurs while executing a query against an FTS5 virtual table in which the GLOB operator is applied to an integer literal. The crash manifests in sqlite3Fts5ExprAnd, a function responsible for combining FTS5 expression nodes during query evaluation. The root cause lies in invalid expression tree construction when processing the GLOB operator with a non-text right-hand operand.
Key technical relationships:
- FTS5 Virtual Table Engine: Handles text search operations through specialized tokenizers (trigram in this case).
- GLOB Operator: Implements pattern matching but requires string operands.
- Expression Tree Construction: FTS5 builds an internal representation of query logic through nodes like Fts5ExprNode.
- sqlite3Fts5ExprAnd: Combines child expression nodes using logical AND semantics.
The crash occurs because the query parser fails to validate operand types for GLOB before building the expression tree. When passed an integer literal (0) instead of a string, FTS5 creates an invalid node structure, and evaluating that structure dereferences a null pointer. The trigram tokenizer exacerbates this by altering how text patterns are processed internally.
Fault Propagation: Type Validation Gaps in FTS5 Query Parsing
Three primary factors contribute to this segmentation fault:
Implicit Type Conversion Mismatch
- Problem: SQLite automatically converts integer 0 to string "0" for GLOB, but FTS5 internals bypass this conversion
- Effect: FTS5 expression parser receives raw integer value instead of expected string pattern
- Code Path: fts5ParseGLOB (internal) fails to handle non-TEXT operands
Uninitialized Expression Node Members
- Location: sqlite3Fts5ExprAnd line 228253 (as per ASAN report)
- Fault: Attempts to access pLeft->pNear when pLeft is null
- Root Cause: fts5ExprParseTerm returns invalid node structure for non-string GLOB RHS
Trigram Tokenizer Interaction
- Tokenizer Hook: fts5TriTokenize modifies pattern handling
- Side Effect: Empty token list generated for numeric GLOB patterns
- Consequence: Triggers edge case in sqlite3Fts5ExprAnd null handling
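The empty token list mentioned above can be illustrated with a toy model. The sketch below is not the real fts5TriTokenize; count_trigrams and get_trigram are hypothetical helpers that slide a 3-byte window over an ASCII string, whereas the real tokenizer works on characters and does more. It only shows why a one-character pattern such as "0" cannot yield any trigram.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
** Toy model of trigram tokenization (hypothetical, NOT the real
** fts5TriTokenize): count the 3-byte windows in an ASCII string.
** Inputs shorter than three bytes produce no tokens at all.
*/
size_t count_trigrams(const char *zText){
  size_t n;
  assert( zText!=NULL );
  n = strlen(zText);
  return n>=3 ? n-2 : 0;
}

/*
** Copy the i-th trigram of zText into zOut (3 bytes plus a NUL).
** Returns 1 on success, 0 if zText has no trigram at index i.
*/
int get_trigram(const char *zText, size_t i, char zOut[4]){
  if( i>=count_trigrams(zText) ) return 0;
  memcpy(zOut, zText+i, 3);
  zOut[3] = '\0';
  return 1;
}
```

Under this model, count_trigrams("0") is zero while count_trigrams("abcd") is two ("abc" and "bcd"), so any code consuming the token list has to cope with the empty case.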
The fault chain progresses as:
GLOB operand type mismatch
→ Invalid FTS5 expression node creation
→ Null child node in AND expression
→ Null pointer dereference during evaluation
Resolution Framework: Type Enforcement and Expression Tree Validation
Step 1: Validate GLOB Operand Types at Parse Time
Modify the FTS5 query parser to enforce the TEXT type for GLOB right-hand operands:
// In fts5_expr.c (fts5ParseGLOB); pVal and pParse are assumed to be in scope:
if( sqlite3_value_type(pVal)!=SQLITE_TEXT ){
  sqlite3ErrorMsg(pParse, "GLOB requires string operand");
  return SQLITE_ERROR;
}
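The same check can be exercised outside SQLite with a small stand-in. This is a minimal sketch under stated assumptions: PatternValue, ValType, and check_glob_operand are hypothetical names invented for illustration and do not exist in the SQLite source. The point is only that the type is inspected before any expression node is built.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the value handed to the FTS5 layer. */
typedef enum { VAL_INTEGER, VAL_TEXT } ValType;

typedef struct {
  ValType type;
  const char *zText;   /* meaningful only when type==VAL_TEXT */
  long long iVal;      /* meaningful only when type==VAL_INTEGER */
} PatternValue;

enum { CHECK_OK = 0, CHECK_ERROR = 1 };

/*
** Reject any GLOB pattern that is not TEXT. On failure *pzErr points
** to a static message; on success it is set to NULL.
*/
int check_glob_operand(const PatternValue *pVal, const char **pzErr){
  assert( pzErr!=NULL );
  if( pVal==NULL || pVal->type!=VAL_TEXT || pVal->zText==NULL ){
    *pzErr = "GLOB requires string operand";
    return CHECK_ERROR;
  }
  *pzErr = NULL;
  return CHECK_OK;
}
```

Returning an error code together with a message keeps the caller in control: it can abort parsing cleanly instead of continuing with a half-built tree.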
Step 2: Add Null Checks in Expression Combination Logic
Harden sqlite3Fts5ExprAnd against invalid nodes:
// Revised logic in sqlite3Fts5ExprAnd:
if( pLeft==0 || pRight==0 ){
  // Release whichever child was built so the failure path does not leak.
  if( pLeft ) sqlite3Fts5ParseNodeFree(pLeft);
  if( pRight ) sqlite3Fts5ParseNodeFree(pRight);
  return SQLITE_ERROR;
}
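The guard above follows a general pattern: a combinator that takes ownership of its children must check both for null and release the surviving one on failure. The following self-contained sketch models that pattern with a simplified, hypothetical ExprNode type; node_new, node_free, and node_and are illustrative names and are not part of FTS5.

```c
#include <assert.h>
#include <stdlib.h>

enum { RC_OK = 0, RC_ERROR = 1 };
enum { NODE_TERM = 0, NODE_AND = 1 };

/* Hypothetical, simplified expression node; not the real Fts5ExprNode. */
typedef struct ExprNode ExprNode;
struct ExprNode {
  int eType;           /* NODE_TERM or NODE_AND */
  ExprNode *pLeft;
  ExprNode *pRight;
};

/* Allocate a node; returns NULL if allocation fails. */
ExprNode *node_new(int eType, ExprNode *pLeft, ExprNode *pRight){
  ExprNode *p = calloc(1, sizeof(*p));
  if( p ){ p->eType = eType; p->pLeft = pLeft; p->pRight = pRight; }
  return p;
}

/* Recursively free a node; safe to call with NULL. */
void node_free(ExprNode *p){
  if( p==NULL ) return;
  node_free(p->pLeft);
  node_free(p->pRight);
  free(p);
}

/*
** Combine two nodes under AND. Takes ownership of both inputs: on any
** failure they are freed and *ppOut is left as NULL.
*/
int node_and(ExprNode *pLeft, ExprNode *pRight, ExprNode **ppOut){
  assert( ppOut!=NULL );
  *ppOut = NULL;
  if( pLeft==NULL || pRight==NULL ){
    node_free(pLeft);
    node_free(pRight);
    return RC_ERROR;
  }
  *ppOut = node_new(NODE_AND, pLeft, pRight);
  if( *ppOut==NULL ){
    node_free(pLeft);
    node_free(pRight);
    return RC_ERROR;
  }
  return RC_OK;
}
```

Because node_free accepts NULL, the error path needs no per-child conditionals, and the caller never sees a partially built AND node.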
Step 3: Handle Empty Token Lists in Trigram Tokenizer
Modify trigram tokenization to return an explicit error when the input buffer is null or empty:
// In fts5_tokenize.c (fts5TriTokenize):
if( pText==0 || nText==0 ){
  return SQLITE_ERROR;
}
Step 4: Update FTS5 Query Documentation
State explicitly in the SQLite documentation that:
- The FTS5 GLOB operator requires explicit string literals
- Implicit type conversion is not supported in FTS5 queries
- CAST() is needed when comparing non-text columns
Step 5: Add Regression Test Case
Implement a test that verifies GLOB operand type handling:
CREATE VIRTUAL TABLE t1 USING fts5(x, tokenize='trigram');
-- Should throw error instead of crash
SELECT * FROM t1 WHERE x GLOB 0;
Step 6: Recompile with Additional Sanitizers
Enhance the build configuration to catch null pointer dereferences and other undefined behavior:
export CFLAGS="$CFLAGS -fsanitize=address,undefined"
Step 7: Debugging Workflow for Similar Issues
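- Re-run the query with WHERE c0 GLOB CAST(0 AS TEXT) to check whether the crash depends on the operand type
- Examine Fts5ExprNode structures using a debugger
- Trace bytecode execution via PRAGMA vdbe_trace=1 (available only in builds compiled with SQLITE_DEBUG)
- Check the prepared statement structure using EXPLAIN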
Final Code Patch Guidance
For SQLite maintainers: The essential fix requires modifying fts5_expr.c to validate operand types before building expression nodes and hardening node combination logic against null children. This prevents invalid tree structures from reaching the evaluation phase.
For application developers: Immediately replace numeric literals in FTS5 GLOB clauses with explicit string literals or parameter bindings. Example remediation:
-- Vulnerable query
SELECT 0 FROM t0(0) WHERE c0 GLOB 0;
-- Corrected version
SELECT 0 FROM t0('0') WHERE c0 GLOB '0';
This comprehensive approach addresses both the immediate null pointer dereference and underlying type validation weaknesses in FTS5 query processing.