Row Value Index Search Behavior with Rowid vs. WITHOUT ROWID Tables

Issue Overview: Partial Index Usage in Row Value Comparisons

The core problem revolves around unexpected query plan behavior when using row value comparisons (e.g., (a,b,i) > (?,?,?)) against indexes containing SQLite’s implicit rowid column. Users observe that when querying a standard rowid table with an index containing the rowid alias (e.g., i INTEGER PRIMARY KEY), the EXPLAIN QUERY PLAN output shows partial index usage. For example, an index on (a,b,i) appears to only use (a,b) for searching, while the same query against a WITHOUT ROWID table with identical schema uses all three columns. This discrepancy raises questions about index storage formats, query planner logic, and fundamental differences between rowid and WITHOUT ROWID table structures.

Key Observations:

  1. Rowid Table (t2):
    Unique index declared as (a,b,i) where i is the rowid alias.
    Query plan shows SEARCH USING COVERING INDEX ... ((a,b)>(?,?))
    Only first two columns used despite explicit three-column comparison

  2. WITHOUT ROWID Table (t3):
    Same index definition (a,b,i) (now primary key)
    Query plan shows full (a,b,i)>(?,?,?) usage

  3. Index Column Order Sensitivity:
    Changing index column order (e.g., (a,i,b)) causes query planner to truncate index usage at first occurrence of rowid alias

Fundamental Concepts at Play:

  • Implicit Rowid Storage: In standard tables, every index implicitly appends the rowid to its stored columns
  • Index Key Composition: For rowid tables, indexes effectively store [IndexedColumns,rowid] even if not explicitly declared
  • Row Value Comparisons: SQLite evaluates tuple comparisons lexically, requiring precise index alignment for optimal search

Possible Causes: Rowid Storage Mechanics and Index Truncation

1. Implicit Rowid Duplication in Indexes

Root Cause:
When creating an index on a rowid table that explicitly includes the rowid alias column (e.g., i INTEGER PRIMARY KEY), SQLite creates a redundant index structure. The physical index stores both the declared columns and the implicit rowid, leading to hidden column duplication.

Example Breakdown:
For table t2 with index (a,b,i):

  • Logical index definition: a, b, i
  • Physical storage: a, b, i, rowid (where i=rowid)
  • Effective key becomes a, b, i, i due to i being rowid alias

This redundancy forces the query planner to make non-obvious decisions about index usage boundaries.

2. Query Planner’s Index Truncation Heuristic

Behavior Trigger:
When the query planner detects consecutive identical columns in an index key (here, i appearing as both explicit column and implicit rowid), it truncates the searchable index columns at the first occurrence of the rowid alias. This occurs because:

  1. The rowid alias (i) and implicit rowid are guaranteed equal
  2. Including both creates logical contradictions in range comparisons
  3. The planner prioritizes avoiding comparison redundancy

Critical Implications:

  • Indexes ending with rowid alias columns become partially unusable for row value comparisons
  • Query plans may show fewer columns used than expected
  • Actual bytecode execution (via EXPLAIN) might still process all columns despite plan output

3. WITHOUT ROWID vs Rowid Table Index Storage

Structural Difference:
WITHOUT ROWID tables store indexes without implicit rowid appending. An index on (a,b,i) in a WITHOUT ROWID table stores exactly those three columns, enabling full usage in row value comparisons. This explains why table t3 shows complete index usage in its query plan.

Storage Comparison:

Table TypeDeclared IndexPhysical Storage
Rowid(a,b,i)(a,b,i,rowid) → (a,b,i,i)
WITHOUT ROWID(a,b,i)(a,b,i)

4. Row Value Comparison Semantics

Lexical Comparison Nuance:
SQLite compares row values component-wise, similar to string sorting:

(a1,b1,i1) > (a2,b2,i2)  
⇨ (a1 > a2) OR (a1=a2 AND b1 > b2) OR (a1=a2 AND b1=b2 AND i1 > i2)

When the index contains duplicate columns (due to implicit rowid), the comparison becomes:

(a1,b1,i1,i1) > (a2,b2,i2,i2)

The fourth column (i1 vs i2) comparison is redundant because i1=i1 and i2=i2 always hold. This redundancy confuses the query planner’s range optimization logic.

Troubleshooting Steps, Solutions & Fixes

Step 1: Diagnose Implicit Index Structure

Action: Use PRAGMA index_xinfo(index_name) to reveal hidden columns.

Example for t2’s autoindex:

PRAGMA index_xinfo(sqlite_autoindex_t2_1);

Expected output:

seqnamedesccollkeyasc
0a0BINARY11
1b0BINARY11
2i0BINARY11
3rowid0BINARY01

Analysis:

  • Column 3 (rowid) is implicitly added despite i being rowid alias
  • Creates logical duplicate between column 2 (i) and 3 (rowid)

Step 2: Avoid Redundant Rowid Inclusion

Solution: Never explicitly include rowid alias columns in indexes for rowid tables.

Correct Approach for t2:

CREATE TABLE t2 (
  i INTEGER PRIMARY KEY,
  a INTEGER NOT NULL,
  b INTEGER NOT NULL
);
CREATE INDEX t2_ab ON t2(a, b);  -- Implicitly includes rowid

Query Adjustment:

SELECT * FROM t2
WHERE (a,b,i) > (?,?,?)  -- i=rowid still comparable
ORDER BY a,b,i
LIMIT 10;

Rationale:

  • Index t2_ab physically stores (a,b,rowid)
  • i is equivalent to rowid, so comparison works
  • Query plan will show SEARCH ... USING INDEX t2_ab ((a,b) > (?,?)) but bytecode uses full index

Step 3: Bytecode-Level Verification

Action: Use EXPLAIN instead of EXPLAIN QUERY PLAN to see actual opcodes.

Example:

EXPLAIN
SELECT * FROM t2 WHERE (a,b,i) > (?,?,?) ORDER BY a,b,i LIMIT 10;

Key Opcodes to Check:

  • ColumnCount: Verify all compared columns are accessed
  • IdxGE: Look for operands covering all index columns
  • DecrJumpZero: Check loop termination conditions

Interpretation:
Even if EXPLAIN QUERY PLAN shows partial index usage, bytecode often reveals full index traversal due to implicit rowid inclusion.

Step 4: Index Column Order Optimization

Rule: Place rowid alias columns after all other indexed columns to prevent truncation.

Incorrect Index:

CREATE INDEX bad_idx ON t2(a, i, b);  -- i=rowid in middle

Query Plan Output:
SEARCH t2 USING INDEX bad_idx (a>?)

Correct Index:

CREATE INDEX good_idx ON t2(a, b);  -- rowid appended automatically

Behavior:

  • Physical index: (a, b, rowid)
  • Supports (a,b,i) > (?,?,?) comparisons via bytecode

Step 5: Consider WITHOUT ROWID When Appropriate

Use Case: Tables requiring clustered indexes or explicit PK ordering.

Conversion Example:

CREATE TABLE t3 (
  i INTEGER PRIMARY KEY,
  a INTEGER NOT NULL,
  b INTEGER NOT NULL
) WITHOUT ROWID;

Index Behavior:

  • Primary key (i) becomes clustered index
  • Additional index (a,b,i) stores exact columns
  • Full index usage in row value comparisons

Tradeoffs:

  • Increased storage space (no implicit rowid)
  • Update overhead for clustered indexes

Step 6: Forced Index Usage Clarification

Technique: Use INDEXED BY to override planner choices.

Example:

SELECT *
FROM t2 INDEXED BY sqlite_autoindex_t2_1
WHERE (a,b,i) > (?,?,?)
ORDER BY a,b,i
LIMIT 10;

Effect:

  • Bypasses query planner’s truncation heuristic
  • Forces full index scan despite redundant columns
  • Verify with EXPLAIN to ensure desired opcodes

Step 7: Query Rewrite for Partial Comparisons

Alternative Approach: Break row value comparison into column-wise clauses.

Original Query:

WHERE (a,b,i) > (?,?,?)

Rewritten:

WHERE a > ?1
   OR (a = ?1 AND b > ?2)
   OR (a = ?1 AND b = ?2 AND i > ?3)

Advantage:

  • Explicit control over comparison logic
  • Works better with compound indexes
  • Planner may use MULTI-INDEX OR optimization

Step 8: Version-Specific Behavior Testing

Consideration: Test across SQLite versions (3.43.0+ shown in examples).

Checklist:

  • Confirm row_value > ? optimization exists in target version
  • Review SQLite Release Notes for index changes
  • Test with latest version using sqlite3_version() function

Step 9: Index Redesign for Rowid Tables

Optimal Index Strategy:

  1. Never include rowid alias in secondary indexes
  2. Leverage implicit rowid appending
  3. Use covering indexes where possible

Example Schema:

CREATE TABLE optimized (
  id INTEGER PRIMARY KEY,
  x INTEGER NOT NULL,
  y INTEGER NOT NULL
);
CREATE INDEX optimized_xy ON optimized(x, y);

Query Efficiency:

-- Uses index (x,y,id) implicitly
SELECT * FROM optimized
WHERE (x,y,id) > (?,?,?)
ORDER BY x,y,id
LIMIT 10;

Step 10: Understanding Storage Engine Internals

Deep Dive: Study SQLite’s index record format from File Format Documentation.

Key Takeaways:

  1. Index records store:
    • Header (type info)
    • Key columns
    • Rowid (for rowid tables)
  2. WITHOUT ROWID tables omit the trailing rowid
  3. Comparison logic operates on serialized B-tree cells

Practical Impact:
Explicit rowid inclusion creates larger index records with duplicated data, affecting both storage and comparison efficiency.

Final Recommendation Summary

  1. Avoid Explicit Rowid in Indexes: Let SQLite handle implicit appending
  2. Prefer WITHOUT ROWID for PK Clustering: When full index control is critical
  3. Validate with EXPLAIN: Don’t rely solely on EXPLAIN QUERY PLAN
  4. Monitor Version Updates: Optimization behaviors evolve
  5. Index Column Order Matters: Place rowid aliases last if unavoidable

By aligning index design with SQLite’s storage mechanics and comparison logic, developers can achieve optimal row value comparison performance while avoiding query planner confusion. The observed behavior isn’t a bug but rather a consequence of SQLite’s rowid handling optimizations, necessitating deliberate schema design to work harmoniously with the engine’s internals.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *