Covering Index Not Utilized for Virtual or Stored Columns in SQLite

Index Storage Behavior for Virtual/Stored Columns & Query Coverage

Issue Overview
When querying columns that are part of an index (including virtual or stored generated columns), SQLite may fail to use the index as a covering index, resulting in unnecessary table accesses. This manifests in query plans showing USING INDEX instead of USING COVERING INDEX when selecting indexed virtual/stored columns. For example:

  • A table tbl has a virtual column c defined as b+1 and an index idx on (a, c).
  • SELECT a FROM tbl uses idx as a covering index.
  • SELECT a, c FROM tbl uses idx but still accesses the table, despite a and c being stored in the index.

This behavior contradicts expectations because covering indexes should eliminate table access when all queried columns exist in the index. The problem persists even with stored (non-virtual) columns or when indexing expressions directly (e.g., b+1). The root cause involves SQLite’s internal handling of generated columns and expression-based indexes during query optimization.


Structural Limitations of Indexes on Generated Columns

Possible Causes

  1. Index Key Composition with RowID:
    SQLite appends the table’s implicit ROWID to every secondary index key unless the table is WITHOUT ROWID. For tbl, the index idx stores (a, c, ROWID). Queries selecting only a avoid table access because the optimizer recognizes that a is fully covered by the index prefix. However, when selecting a and c, the optimizer may erroneously require the ROWID to locate the table row, even though c is stored in the index. This occurs because SQLite’s query planner does not fully propagate metadata about stored generated column values to the coverage-checking logic.

  2. Generated Column Dependency Tracking:
    Virtual columns depend on other columns (e.g., c = b+1). While the index stores the computed value of c, the optimizer may conservatively assume that accessing the base table is necessary to validate dependencies or recalculate the value (even though recomputation is redundant for stored/indexed values). This is a heuristic flaw in the planner’s dependency resolution.

  3. Expression Indexes and Column Aliasing:
    When an index is created on an expression (e.g., b+1), SQLite does not map the expression to column aliases (like virtual column c). Thus, SELECT a, c (where c is an alias for b+1) may not trigger a covering scan for an index on b+1, as the planner treats the expression and the alias as distinct entities.

  4. Primary Key Interactions:
    The primary key (a, b) influences index structure. Since c depends on b, the planner may incorrectly infer that accessing b from the table is necessary to compute c, even though c is precomputed in the index. This is a missed optimization opportunity when analyzing column dependencies.


Schema Adjustments and Planner Directives for Coverage

Troubleshooting Steps, Solutions & Fixes

Step 1: Verify Index Content and Query Plans
Use PRAGMA index_info('idx'); to confirm the indexed columns. For the example schema, this will show columns a (seq=0) and c (seq=1). The presence of c in the index confirms it should be coverable.

Step 2: Force Covering Scans with Index Hints
Add redundant WHERE clauses to guide the planner:

SELECT a, c FROM tbl INDEXED BY idx WHERE c IS NOT NULL;

This hints that c is already materialized in the index, bypassing dependency checks.

Step 3: Use COVERING INDEX Syntax (Advanced)
Modify the index to include all queried columns explicitly. For tables with strict storage requirements:

CREATE INDEX idx ON tbl(a, c) INCLUDE (c);  -- SQLite 3.39+  

However, SQLite does not support INCLUDE for generated columns. Instead, redefine the index to embed dependencies:

CREATE INDEX idx ON tbl(a, c, b);  

This includes b to satisfy dependency checks, though it increases index size.

Step 4: Rewrite Queries to Reference Indexed Expressions Directly
If c is a virtual column, query the underlying expression:

SELECT a, b+1 FROM tbl INDEXED BY idx;  

Ensure the index is created on (a, b+1) instead of (a, c). This aligns the query with the index’s expression, allowing coverage.

Step 5: Migrate to WITHOUT ROWID Tables
Convert tbl to a WITHOUT ROWID table, as its primary key (a, b) becomes the clustering key. Secondary indexes then reference the primary key directly instead of the ROWID:

CREATE TABLE tbl (a INTEGER NOT NULL, b INTEGER NOT NULL, c INTEGER NOT NULL AS (b+1) VIRTUAL, PRIMARY KEY (a, b)) WITHOUT ROWID;  
CREATE INDEX idx ON tbl(a, c);  

Indexes on WITHOUT ROWID tables avoid ROWID appends, potentially resolving coverage ambiguity.

Step 6: Update SQLite Version and Rebuild Schema
Ensure SQLite is updated to the latest version. Schema rebuilds can refresh internal statistics:

VACUUM;  
REINDEX;  

Final Workaround: Materialize Critical Columns
For critical queries, create a materialized view with triggers to replicate tbl’s data, including c as a physical column. This eliminates generated column overhead but increases storage.

Conclusion
SQLite’s query planner occasionally fails to recognize covering index opportunities for generated columns due to conservative dependency analysis and ROWID handling. Adjusting index structure, query syntax, or table schema often resolves the issue. For persistent cases, consider filing a bug report with the SQLite team, including reproducible examples.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *