Issue Overview: Partial Index Usage in Row Value Comparisons

The core problem revolves around unexpected query plan behavior when using row value comparisons (e.g., (a,b,i) > (?,?,?)) against indexes containing SQLite’s implicit rowid column. Users observe that when querying a standard rowid table with an index containing the rowid alias (e.g., i INTEGER PRIMARY KEY), the EXPLAIN QUERY PLAN output shows partial index usage. For example, an index on (a,b,i) appears to only use (a,b) for searching, while the same query against a WITHOUT ROWID table with identical schema uses all three columns. This discrepancy raises questions about index storage formats, query planner logic, and fundamental differences between rowid and WITHOUT ROWID table structures.

Key Observations:

Rowid Table (t2):
Unique index declared as (a,b,i) where i is the rowid alias.
Query plan shows SEARCH USING COVERING INDEX ... ((a,b)>(?,?))
Only first two columns used despite explicit three-column comparison
WITHOUT ROWID Table (t3):
Same index definition (a,b,i) (now primary key)
Query plan shows full (a,b,i)>(?,?,?) usage
Index Column Order Sensitivity:
Changing index column order (e.g., (a,i,b)) causes query planner to truncate index usage at first occurrence of rowid alias

Fundamental Concepts at Play:

Implicit Rowid Storage: In standard tables, every index implicitly appends the rowid to its stored columns
Index Key Composition: For rowid tables, indexes effectively store [IndexedColumns,rowid] even if not explicitly declared
Row Value Comparisons: SQLite evaluates tuple comparisons lexically, requiring precise index alignment for optimal search

Possible Causes: Rowid Storage Mechanics and Index Truncation

1. Implicit Rowid Duplication in Indexes

Root Cause:
When creating an index on a rowid table that explicitly includes the rowid alias column (e.g., i INTEGER PRIMARY KEY), SQLite creates a redundant index structure. The physical index stores both the declared columns and the implicit rowid, leading to hidden column duplication.

Example Breakdown:
For table t2 with index (a,b,i):

Logical index definition: a, b, i
Physical storage: a, b, i, rowid (where i=rowid)
Effective key becomes a, b, i, i due to i being rowid alias

This redundancy forces the query planner to make non-obvious decisions about index usage boundaries.

2. Query Planner’s Index Truncation Heuristic

Behavior Trigger:
When the query planner detects consecutive identical columns in an index key (here, i appearing as both explicit column and implicit rowid), it truncates the searchable index columns at the first occurrence of the rowid alias. This occurs because:

The rowid alias (i) and implicit rowid are guaranteed equal
Including both creates logical contradictions in range comparisons
The planner prioritizes avoiding comparison redundancy

Critical Implications:

Indexes ending with rowid alias columns become partially unusable for row value comparisons
Query plans may show fewer columns used than expected
Actual bytecode execution (via EXPLAIN) might still process all columns despite plan output

3. WITHOUT ROWID vs Rowid Table Index Storage

Structural Difference:
WITHOUT ROWID tables store indexes without implicit rowid appending. An index on (a,b,i) in a WITHOUT ROWID table stores exactly those three columns, enabling full usage in row value comparisons. This explains why table t3 shows complete index usage in its query plan.

Storage Comparison:

Table Type	Declared Index	Physical Storage
Rowid	(a,b,i)	(a,b,i,rowid) → (a,b,i,i)
WITHOUT ROWID	(a,b,i)	(a,b,i)

4. Row Value Comparison Semantics

Lexical Comparison Nuance:
SQLite compares row values component-wise, similar to string sorting:

(a1,b1,i1) > (a2,b2,i2)  
⇨ (a1 > a2) OR (a1=a2 AND b1 > b2) OR (a1=a2 AND b1=b2 AND i1 > i2)

When the index contains duplicate columns (due to implicit rowid), the comparison becomes:

(a1,b1,i1,i1) > (a2,b2,i2,i2)

The fourth column (i1 vs i2) comparison is redundant because i1=i1 and i2=i2 always hold. This redundancy confuses the query planner’s range optimization logic.

Troubleshooting Steps, Solutions & Fixes

Step 1: Diagnose Implicit Index Structure

Action: Use PRAGMA index_xinfo(index_name) to reveal hidden columns.

Example for t2’s autoindex:

PRAGMA index_xinfo(sqlite_autoindex_t2_1);

Expected output:

seq	name	coll	key	asc
0	a	BINARY	1	1
1	b	BINARY	1	1
2	i	BINARY	1	1
3	rowid	BINARY	0	1

Analysis:

Column 3 (rowid) is implicitly added despite i being rowid alias
Creates logical duplicate between column 2 (i) and 3 (rowid)

Step 2: Avoid Redundant Rowid Inclusion

Solution: Never explicitly include rowid alias columns in indexes for rowid tables.

Correct Approach for t2:

CREATE TABLE t2 (
  i INTEGER PRIMARY KEY,
  a INTEGER NOT NULL,
  b INTEGER NOT NULL
);
CREATE INDEX t2_ab ON t2(a, b);  -- Implicitly includes rowid

Query Adjustment:

SELECT * FROM t2
WHERE (a,b,i) > (?,?,?)  -- i=rowid still comparable
ORDER BY a,b,i
LIMIT 10;

Rationale:

Index t2_ab physically stores (a,b,rowid)
i is equivalent to rowid, so comparison works
Query plan will show SEARCH ... USING INDEX t2_ab ((a,b) > (?,?)) but bytecode uses full index

Step 3: Bytecode-Level Verification

Action: Use EXPLAIN instead of EXPLAIN QUERY PLAN to see actual opcodes.

Example:

EXPLAIN
SELECT * FROM t2 WHERE (a,b,i) > (?,?,?) ORDER BY a,b,i LIMIT 10;

Key Opcodes to Check:

ColumnCount: Verify all compared columns are accessed
IdxGE: Look for operands covering all index columns
DecrJumpZero: Check loop termination conditions

Interpretation:
Even if EXPLAIN QUERY PLAN shows partial index usage, bytecode often reveals full index traversal due to implicit rowid inclusion.

Step 4: Index Column Order Optimization

Rule: Place rowid alias columns after all other indexed columns to prevent truncation.

Incorrect Index:

CREATE INDEX bad_idx ON t2(a, i, b);  -- i=rowid in middle

Query Plan Output:
SEARCH t2 USING INDEX bad_idx (a>?)

Correct Index:

CREATE INDEX good_idx ON t2(a, b);  -- rowid appended automatically

Behavior:

Physical index: (a, b, rowid)
Supports (a,b,i) > (?,?,?) comparisons via bytecode

Step 5: Consider WITHOUT ROWID When Appropriate

Use Case: Tables requiring clustered indexes or explicit PK ordering.

Conversion Example:

CREATE TABLE t3 (
  i INTEGER PRIMARY KEY,
  a INTEGER NOT NULL,
  b INTEGER NOT NULL
) WITHOUT ROWID;

Index Behavior:

Primary key (i) becomes clustered index
Additional index (a,b,i) stores exact columns
Full index usage in row value comparisons

Tradeoffs:

Increased storage space (no implicit rowid)
Update overhead for clustered indexes

Step 6: Forced Index Usage Clarification

Technique: Use INDEXED BY to override planner choices.

Example:

SELECT *
FROM t2 INDEXED BY sqlite_autoindex_t2_1
WHERE (a,b,i) > (?,?,?)
ORDER BY a,b,i
LIMIT 10;

Effect:

Bypasses query planner’s truncation heuristic
Forces full index scan despite redundant columns
Verify with EXPLAIN to ensure desired opcodes

Step 7: Query Rewrite for Partial Comparisons

Alternative Approach: Break row value comparison into column-wise clauses.

Original Query:

WHERE (a,b,i) > (?,?,?)

Rewritten:

WHERE a > ?1
   OR (a = ?1 AND b > ?2)
   OR (a = ?1 AND b = ?2 AND i > ?3)

Advantage:

Explicit control over comparison logic
Works better with compound indexes
Planner may use MULTI-INDEX OR optimization

Step 8: Version-Specific Behavior Testing

Consideration: Test across SQLite versions (3.43.0+ shown in examples).

Checklist:

Confirm row_value > ? optimization exists in target version
Review SQLite Release Notes for index changes
Test with latest version using sqlite3_version() function

Step 9: Index Redesign for Rowid Tables

Optimal Index Strategy:

Never include rowid alias in secondary indexes
Leverage implicit rowid appending
Use covering indexes where possible

Example Schema:

CREATE TABLE optimized (
  id INTEGER PRIMARY KEY,
  x INTEGER NOT NULL,
  y INTEGER NOT NULL
);
CREATE INDEX optimized_xy ON optimized(x, y);

Query Efficiency:

-- Uses index (x,y,id) implicitly
SELECT * FROM optimized
WHERE (x,y,id) > (?,?,?)
ORDER BY x,y,id
LIMIT 10;

Step 10: Understanding Storage Engine Internals

Deep Dive: Study SQLite’s index record format from File Format Documentation.

Key Takeaways:

Index records store:
- Header (type info)
- Key columns
- Rowid (for rowid tables)
WITHOUT ROWID tables omit the trailing rowid
Comparison logic operates on serialized B-tree cells

Practical Impact:
Explicit rowid inclusion creates larger index records with duplicated data, affecting both storage and comparison efficiency.

Final Recommendation Summary

Avoid Explicit Rowid in Indexes: Let SQLite handle implicit appending
Prefer WITHOUT ROWID for PK Clustering: When full index control is critical
Validate with EXPLAIN: Don’t rely solely on EXPLAIN QUERY PLAN
Monitor Version Updates: Optimization behaviors evolve
Index Column Order Matters: Place rowid aliases last if unavoidable

By aligning index design with SQLite’s storage mechanics and comparison logic, developers can achieve optimal row value comparison performance while avoiding query planner confusion. The observed behavior isn’t a bug but rather a consequence of SQLite’s rowid handling optimizations, necessitating deliberate schema design to work harmoniously with the engine’s internals.

Row Value Index Search Behavior with Rowid vs. WITHOUT ROWID Tables

Issue Overview: Partial Index Usage in Row Value Comparisons

Key Observations:

Fundamental Concepts at Play:

Possible Causes: Rowid Storage Mechanics and Index Truncation

1. Implicit Rowid Duplication in Indexes

2. Query Planner’s Index Truncation Heuristic

3. WITHOUT ROWID vs Rowid Table Index Storage

4. Row Value Comparison Semantics

Troubleshooting Steps, Solutions & Fixes

Step 1: Diagnose Implicit Index Structure

Step 2: Avoid Redundant Rowid Inclusion

Step 3: Bytecode-Level Verification

Step 4: Index Column Order Optimization

Step 5: Consider WITHOUT ROWID When Appropriate

Step 6: Forced Index Usage Clarification

Step 7: Query Rewrite for Partial Comparisons

Step 8: Version-Specific Behavior Testing

Step 9: Index Redesign for Rowid Tables

Step 10: Understanding Storage Engine Internals

Final Recommendation Summary

Clustered and Non-Clustered Indexes in SQLite: A Comprehensive Guide

Optimizing SQLite `PRAGMA integrity_check` Performance for Large-Scale ETL Processes

SQLite Update Behavior and Optimizing No-Op Updates

Slow Cross-Database Joins in SQLite: Diagnosis and Optimization

Optimizing SQLite Queries for High-Volume Ticket Reporting on Embedded Devices

Tracking Prepared Statement Finalization and Performance Metrics in SQLite

Leave a Reply Cancel reply

Issue Overview: Partial Index Usage in Row Value Comparisons

Key Observations:

Fundamental Concepts at Play:

Possible Causes: Rowid Storage Mechanics and Index Truncation

1. Implicit Rowid Duplication in Indexes

2. Query Planner’s Index Truncation Heuristic

3. WITHOUT ROWID vs Rowid Table Index Storage

4. Row Value Comparison Semantics

Troubleshooting Steps, Solutions & Fixes

Step 1: Diagnose Implicit Index Structure

Step 2: Avoid Redundant Rowid Inclusion

Step 3: Bytecode-Level Verification

Step 4: Index Column Order Optimization

Step 5: Consider WITHOUT ROWID When Appropriate

Step 6: Forced Index Usage Clarification

Step 7: Query Rewrite for Partial Comparisons

Step 8: Version-Specific Behavior Testing

Step 9: Index Redesign for Rowid Tables

Step 10: Understanding Storage Engine Internals

Final Recommendation Summary

Related Guides

Leave a Reply Cancel reply