Optimizing SQLite CTE Performance for Combined Queries with UNION ALL

Understanding Query Performance Degradation in Combined CTE and UNION ALL Operations

Issue Overview: Unexpected Slowdown When Consolidating Multiple Queries into a Single CTE-Union Structure

The core challenge involves a significant performance degradation when attempting to consolidate five separate SQLite queries into a single query using a Common Table Expression (CTE) and UNION ALL clauses. The original workflow executed five distinct operations against a table filtered by state = 'Kentucky':

  1. A filtered row selection with ordering and a limit (78ms)
  2. A total row count (7ms)
  3. Three grouped aggregation queries for facets on state, county, and fips (12ms, 50ms, 52ms)

The combined query uses a filtered CTE to share the base dataset across subsequent UNION ALL components. Despite expectations of reduced redundant computation, the consolidated query underperforms—taking 200ms or more, compared to 199ms for individual queries. Key observations include:

  • Identical filter logic (state = 'Kentucky') across all components
  • Repeated references to the filtered CTE
  • Use of ORDER BY and LIMIT in subqueries
  • Structural alignment via UNION ALL with NULL placeholders

This anomaly highlights a conflict between intuitive query design and SQLite’s query planner behavior, particularly around CTE materialization and optimization boundaries.

Possible Causes: CTE Materialization Policies, Index Utilization, and Query Planner Limitations

1. Non-Materialized CTE in SQLite Versions Prior to 3.35

SQLite versions before 3.35 default to NOT MATERIALIZED for CTEs, treating them as inline views rather than temporary tables. When a CTE is referenced multiple times (as filtered is here), each reference re-executes the base query, negating any efficiency gains from shared computation. In this case, the filtered CTE is referenced six times (once in each subquery and once in the rows CTE), forcing six full scans of ny_times_us_counties with the state = 'Kentucky' filter. This contrasts with the original separate queries, which each executed the filter once but with potential index optimizations.

2. Inefficient Index Usage in Aggregation Subqueries

The original separate queries may benefit from indexes on state, county, fips, or composite indexes that accelerate filtering and grouping. When consolidated into a single query, SQLite’s query planner may fail to apply these indexes optimally to the CTE-derived data. For example:

  • The facet_state subquery groups by state, but if the filtered CTE isn’t materialized, the lack of an index on the virtual result set forces a full scan and temporary sort.
  • The rows subquery’s ORDER BY date DESC LIMIT 101 might use an index on date when run standalone but resort to a full scan and sort when embedded in the CTE structure.

3. Optimization Barrier Introduced by UNION ALL

SQLite treats each UNION ALL component as a separate subquery, creating optimization barriers. The query planner cannot reorder operations across UNION ALL boundaries or deduplicate work between them. This forces sequential execution of each subquery without cross-component optimizations, such as sharing sorts or intermediate results. The structural alignment requirement—using NULL placeholders to match column counts across UNION ALL—further complicates type affinity and storage allocation, adding overhead.

4. Resource Contention and Temporary Storage Overhead

Combining multiple operations into one query increases memory and temporary storage demands. SQLite may spill intermediate results to disk (e.g., during sorts for ORDER BY or GROUP BY) more aggressively in a single large query than in smaller, discrete queries. This I/O overhead compounds with the repeated CTE executions.

Troubleshooting Steps, Solutions & Fixes: Addressing Materialization, Indexing, and Query Structure

1. Upgrade to SQLite 3.35+ and Enforce CTE Materialization

SQLite 3.35 introduced automatic materialization for CTEs referenced multiple times. Upgrading to this version allows the filtered CTE to be evaluated once and stored in a temporary table, eliminating redundant base table scans. If upgrading isn’t feasible, simulate materialization using a temporary table:

CREATE TEMP TABLE temp_filtered AS 
SELECT rowid, date, county, state, fips, cases, deaths 
FROM ny_times_us_counties 
WHERE state = 'Kentucky';

Then replace all filtered CTE references with temp_filtered.

2. Optimize Indexes for Filtered and Aggregated Columns

Ensure the base table has indexes that support both the initial filter and subsequent aggregations:

  • Composite Index on Filter Column + Aggregation Columns:
    CREATE INDEX idx_ny_times_us_counties_state ON ny_times_us_counties (state, date, county, fips);
    

    This index accelerates the state = 'Kentucky' filter and provides ordered data for date DESC sorting and GROUP BY operations.

  • Covering Indexes for Facet Aggregations:
    For each facet (state, county, fips), create indexes that allow "index-only" scans:

    CREATE INDEX idx_ny_times_us_counties_state_covering ON ny_times_us_counties (state) INCLUDE (cases, deaths);
    

    Adjust included columns based on actual query projections.

3. Refactor the Query to Minimize CTE Reuse and Leverage Subquery Flattening

Avoid over-reliance on CTEs in versions where materialization isn’t guaranteed. Instead, use subqueries or derived tables that allow the query planner to flatten and optimize:

SELECT * FROM (
  SELECT null AS facet, null AS facet_name, null AS facet_value, 
         rowid, date, county, state, fips, cases, deaths 
  FROM ny_times_us_counties 
  WHERE state = 'Kentucky' 
  ORDER BY date DESC LIMIT 101
) AS rows
UNION ALL
SELECT 'COUNT' AS facet, null, COUNT(*), null, null, null, null, null, null, null 
FROM ny_times_us_counties 
WHERE state = 'Kentucky'
UNION ALL
... [similar for other facets];

This structure may allow SQLite to apply indexes more effectively and avoid CTE-related overhead.

4. Profile Query Execution with EXPLAIN and EXPLAIN QUERY PLAN

Use SQLite’s diagnostic tools to compare the execution strategies of the original and consolidated queries:

EXPLAIN QUERY PLAN
WITH filtered AS (...)
... [rest of the query];

Key indicators to check:

  • SCAN TABLE ny_times_us_counties vs. SEARCH using an index
  • USE TEMP B-TREE FOR ORDER BY or GROUP BY
  • MATERIALIZE directives for CTEs

If the filtered CTE shows multiple SCAN or SEARCH operations, confirm that materialization isn’t occurring.

5. Benchmark Alternative Query Structures

Experiment with alternative approaches to consolidate operations without CTEs:

  • Combine Aggregations with Window Functions:
    Use a single pass over the data to compute counts and facets:

    SELECT 
      'COUNT' AS facet, NULL AS facet_name, COUNT(*) AS facet_value,
      NULL, NULL, NULL, NULL, NULL, NULL, NULL
    FROM ny_times_us_counties 
    WHERE state = 'Kentucky'
    UNION ALL
    SELECT 
      'state', state, COUNT(*),
      NULL, NULL, NULL, NULL, NULL, NULL, NULL
    FROM ny_times_us_counties 
    WHERE state = 'Kentucky' 
    GROUP BY state
    ... [similar for county, fips];
    
  • Leverage Pagination and Deferred Sorting:
    Separate the ORDER BY date DESC LIMIT 101 operation into a standalone query, as it may conflict with aggregation optimizations.

6. Adjust SQLite Runtime Configuration

Tune settings to prioritize memory over temporary storage and enable optimizations:

PRAGMA temp_store = MEMORY; -- Keep temp data in RAM
PRAGMA cache_size = -100000; -- Allocate 100MB cache
PRAGMA mmap_size = 268435456; -- Use memory-mapped I/O for large datasets

7. Consider Application-Level Caching or Precomputation

If query latency remains unacceptable, precompute facet counts and store them in auxiliary tables refreshed periodically or via triggers. This shifts computational overhead to write operations, which may be preferable for read-heavy applications like Datasette.

By systematically addressing CTE materialization, index design, and query structure, significant performance improvements can be achieved—often surpassing the original separate-query approach. The interplay between SQLite’s optimizer, version-specific behaviors, and physical schema design necessitates a holistic tuning strategy tailored to the specific workload and data profile.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *