Unexpected RIGHT JOIN Results with WHERE Clause on View in SQLite


Issue Overview: RIGHT OUTER JOIN Queries with Views and WHERE Clauses Return Inconsistent Counts

The core issue arises when combining RIGHT OUTER JOIN operations involving views and tables in SQLite, particularly when applying WHERE clauses to filter results. Users observed that three nearly identical queries produced inconsistent counts, defying logical expectations. Here’s a breakdown of the problem:

  1. First Query:

    SELECT COUNT(*) FROM rt0 LEFT OUTER JOIN rt3 ON NULL RIGHT OUTER JOIN v6 ON ... 
    

    Returns 1 row (correct).
    The RIGHT OUTER JOIN prioritizes the view v6, which contains one row. Since rt0 and rt3 are joined on NULL (a no-op), the result includes all rows from v6, ignoring rt0 and rt3 due to the join condition’s logical structure.

  2. Second Query:

    SELECT COUNT(*) FROM ... WHERE (rt0.c1); 
    

    Returns 2 rows (incorrect).
    The WHERE (rt0.c1) clause introduces a filter that unexpectedly includes rows where rt0.c1 is non-zero. However, due to the interaction between the RIGHT OUTER JOIN and the view v6, SQLite’s query planner mishandles the join order, causing invalid matches.

  3. Third Query:

    SELECT COUNT(*) FROM ... WHERE (rt0.c1) IS TRUE; 
    

    Returns 0 rows (correct).
    The IS TRUE operator explicitly filters out NULL and non-boolean values, ensuring only TRUE evaluations are retained. This bypasses the planner’s flawed optimization.

Key Observations:

  • The discrepancy occurs only when a VIEW is used in the RIGHT OUTER JOIN. Replacing the view with a real table resolves the issue.
  • The problem is tied to SQLite’s RIGHT-JOIN strength reduction optimization introduced in version 3.43.0. This optimization rewrites RIGHT JOIN as LEFT JOIN internally but fails to account for edge cases involving views and WHERE clauses.
  • The WHERE (rt0.c1) clause in the second query evaluates rt0.c1 as a boolean. In SQLite, 0 is FALSE, non-zero is TRUE, and NULL is NULL. However, the optimizer incorrectly retains rows where rt0.c1 is NULL or invalid (e.g., -1e500 stored as -Inf), leading to overcounting.

Possible Causes: Query Planner Optimization Flaws and Boolean Evaluation Ambiguity

1. RIGHT-JOIN Strength Reduction Optimization Flaw

SQLite’s query planner rewrites RIGHT JOIN as LEFT JOIN for efficiency. For example:

SELECT * FROM A RIGHT JOIN B ON ... 

is rewritten as:

SELECT * FROM B LEFT JOIN A ON ... 

This works for simple cases but fails when:

  • Views are involved: Views are treated as ephemeral tables, and their materialization timing affects join reordering.
  • WHERE clauses reference outer tables: Columns from the left side of the original RIGHT JOIN (now the right side of the rewritten LEFT JOIN) are referenced in WHERE, causing the planner to misapply filters.

In the problematic query:

SELECT * FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 ON ... WHERE (rt0.c1)

The rewritten LEFT JOIN places v6 on the left and rt0/rt3 on the right. The WHERE (rt0.c1) clause attempts to filter rt0, which is part of the right side of the LEFT JOIN. If rt0 has no matching rows, rt0.c1 is NULL, but the filter WHERE (rt0.c1) evaluates NULL as falsy. However, due to the optimization bug, the planner incorrectly retains these NULL rows.

2. Implicit Boolean Casting with WHERE (rt0.c1)

SQLite’s WHERE clause evaluates expressions as booleans using zero vs. non-zero logic:

  • 0FALSE
  • Non-zero → TRUE
  • NULLNULL (treated as falsy).

However, when rt0.c1 contains values like -1e500 (stored as -Inf), the evaluation becomes ambiguous. The WHERE (rt0.c1) clause treats -Inf as non-zero (TRUE), while WHERE (rt0.c1) IS TRUE explicitly excludes NULL and non-integer values.

3. View Materialization Timing

Views are materialized at query runtime as temporary tables. The optimizer’s join reordering interacts poorly with view materialization, especially when combined with LEFT JOIN rewrites. In contrast, real tables have fixed schemas and statistics, allowing the planner to make safer decisions.


Troubleshooting Steps, Solutions & Fixes

1. Verify SQLite Version and Apply Patches

The bug was fixed in check-in 530d10e93a5f63b7 (post-3.44.0). To resolve the issue:

  • Upgrade to SQLite 3.44.0+ with the fix included.
  • Check version:
    SELECT sqlite_version();
    
  • For older versions: Rewrite queries to avoid the optimization or use workarounds.

2. Rewrite Queries to Bypass the Optimization

Option 1: Use IS TRUE Explicitly
Replace WHERE (rt0.c1) with WHERE (rt0.c1) IS TRUE to exclude NULL and invalid values:

SELECT COUNT(*) FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 ON ... WHERE (rt0.c1) IS TRUE;

Option 2: Materialize Views as Tables
Replace the view v6 with a real table:

DROP VIEW v6;
CREATE TABLE v6(c0 INT, c1 INT, c2 INT);
INSERT INTO v6 VALUES(0,0,0);

This eliminates view materialization timing issues.

Option 3: Disable RIGHT-JOIN Optimization
Force the planner to avoid rewriting RIGHT JOIN as LEFT JOIN using INDEXED BY or NOT INDEXED (not always practical):

SELECT COUNT(*) FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 INDEXED BY <non-existent-index> ... 

Note: This is a hack and may not work in all cases.

3. Analyze Query Plans

Use EXPLAIN QUERY PLAN to detect join reordering:

EXPLAIN QUERY PLAN
SELECT COUNT(*) FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 ON ... WHERE (rt0.c1);

Faulty Plan:

|--MATERIALIZE v6
|--SCAN rt0
|--SCAN rt3 LEFT-JOIN
|--SCAN v6
`--RIGHT-JOIN v6
  `--SCAN v6

Healthy Plan (after fix or workaround):

|--MATERIALIZE v6
|--SCAN v6
|--SCAN rt0
`--SCAN rt3 LEFT-JOIN

4. Validate Data Types and Constraints

Ensure columns referenced in WHERE clauses have explicit types and constraints:

CREATE TABLE rt0 (
  c0 INTEGER, 
  c1 INTEGER CHECK (c1 IS TRUE OR c1 IS FALSE),  -- Enforce boolean
  c2 INTEGER, 
  c3 INTEGER, 
  c4 INTEGER
);

This prevents invalid values like -Inf from being stored.

5. Report Edge Cases to SQLite Team

If encountering unfixed issues, provide a minimal reproducible example:

  • Simplify tables, views, and data.
  • Include EXPLAIN QUERY PLAN outputs.
  • Test on the latest SQLite version.

Final Note: The root cause was a planner optimization flaw specific to RIGHT JOIN rewrites with views and ambiguous WHERE clauses. By understanding SQLite’s boolean evaluation rules, join optimizations, and view materialization, developers can diagnose similar issues and apply targeted fixes.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *