Unexpected RIGHT JOIN Results with WHERE Clause on View in SQLite
Issue Overview: RIGHT OUTER JOIN Queries with Views and WHERE Clauses Return Inconsistent Counts
The core issue arises when combining RIGHT OUTER JOIN
operations involving views and tables in SQLite, particularly when applying WHERE
clauses to filter results. Users observed that three nearly identical queries produced inconsistent counts, defying logical expectations. Here’s a breakdown of the problem:
First Query:
SELECT COUNT(*) FROM rt0 LEFT OUTER JOIN rt3 ON NULL RIGHT OUTER JOIN v6 ON ...
Returns 1 row (correct).
TheRIGHT OUTER JOIN
prioritizes the viewv6
, which contains one row. Sincert0
andrt3
are joined onNULL
(a no-op), the result includes all rows fromv6
, ignoringrt0
andrt3
due to the join condition’s logical structure.Second Query:
SELECT COUNT(*) FROM ... WHERE (rt0.c1);
Returns 2 rows (incorrect).
TheWHERE (rt0.c1)
clause introduces a filter that unexpectedly includes rows wherert0.c1
is non-zero. However, due to the interaction between theRIGHT OUTER JOIN
and the viewv6
, SQLite’s query planner mishandles the join order, causing invalid matches.Third Query:
SELECT COUNT(*) FROM ... WHERE (rt0.c1) IS TRUE;
Returns 0 rows (correct).
TheIS TRUE
operator explicitly filters outNULL
and non-boolean values, ensuring onlyTRUE
evaluations are retained. This bypasses the planner’s flawed optimization.
Key Observations:
- The discrepancy occurs only when a
VIEW
is used in theRIGHT OUTER JOIN
. Replacing the view with a real table resolves the issue. - The problem is tied to SQLite’s RIGHT-JOIN strength reduction optimization introduced in version 3.43.0. This optimization rewrites
RIGHT JOIN
asLEFT JOIN
internally but fails to account for edge cases involving views andWHERE
clauses. - The
WHERE (rt0.c1)
clause in the second query evaluatesrt0.c1
as a boolean. In SQLite,0
isFALSE
, non-zero isTRUE
, andNULL
isNULL
. However, the optimizer incorrectly retains rows wherert0.c1
isNULL
or invalid (e.g.,-1e500
stored as-Inf
), leading to overcounting.
Possible Causes: Query Planner Optimization Flaws and Boolean Evaluation Ambiguity
1. RIGHT-JOIN Strength Reduction Optimization Flaw
SQLite’s query planner rewrites RIGHT JOIN
as LEFT JOIN
for efficiency. For example:
SELECT * FROM A RIGHT JOIN B ON ...
is rewritten as:
SELECT * FROM B LEFT JOIN A ON ...
This works for simple cases but fails when:
- Views are involved: Views are treated as ephemeral tables, and their materialization timing affects join reordering.
- WHERE clauses reference outer tables: Columns from the left side of the original
RIGHT JOIN
(now the right side of the rewrittenLEFT JOIN
) are referenced inWHERE
, causing the planner to misapply filters.
In the problematic query:
SELECT * FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 ON ... WHERE (rt0.c1)
The rewritten LEFT JOIN
places v6
on the left and rt0
/rt3
on the right. The WHERE (rt0.c1)
clause attempts to filter rt0
, which is part of the right side of the LEFT JOIN
. If rt0
has no matching rows, rt0.c1
is NULL
, but the filter WHERE (rt0.c1)
evaluates NULL
as falsy. However, due to the optimization bug, the planner incorrectly retains these NULL
rows.
2. Implicit Boolean Casting with WHERE (rt0.c1)
SQLite’s WHERE
clause evaluates expressions as booleans using zero vs. non-zero logic:
0
→FALSE
- Non-zero →
TRUE
NULL
→NULL
(treated as falsy).
However, when rt0.c1
contains values like -1e500
(stored as -Inf
), the evaluation becomes ambiguous. The WHERE (rt0.c1)
clause treats -Inf
as non-zero (TRUE
), while WHERE (rt0.c1) IS TRUE
explicitly excludes NULL
and non-integer values.
3. View Materialization Timing
Views are materialized at query runtime as temporary tables. The optimizer’s join reordering interacts poorly with view materialization, especially when combined with LEFT JOIN
rewrites. In contrast, real tables have fixed schemas and statistics, allowing the planner to make safer decisions.
Troubleshooting Steps, Solutions & Fixes
1. Verify SQLite Version and Apply Patches
The bug was fixed in check-in 530d10e93a5f63b7 (post-3.44.0). To resolve the issue:
- Upgrade to SQLite 3.44.0+ with the fix included.
- Check version:
SELECT sqlite_version();
- For older versions: Rewrite queries to avoid the optimization or use workarounds.
2. Rewrite Queries to Bypass the Optimization
Option 1: Use IS TRUE
Explicitly
Replace WHERE (rt0.c1)
with WHERE (rt0.c1) IS TRUE
to exclude NULL
and invalid values:
SELECT COUNT(*) FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 ON ... WHERE (rt0.c1) IS TRUE;
Option 2: Materialize Views as Tables
Replace the view v6
with a real table:
DROP VIEW v6;
CREATE TABLE v6(c0 INT, c1 INT, c2 INT);
INSERT INTO v6 VALUES(0,0,0);
This eliminates view materialization timing issues.
Option 3: Disable RIGHT-JOIN Optimization
Force the planner to avoid rewriting RIGHT JOIN
as LEFT JOIN
using INDEXED BY
or NOT INDEXED
(not always practical):
SELECT COUNT(*) FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 INDEXED BY <non-existent-index> ...
Note: This is a hack and may not work in all cases.
3. Analyze Query Plans
Use EXPLAIN QUERY PLAN
to detect join reordering:
EXPLAIN QUERY PLAN
SELECT COUNT(*) FROM rt0 LEFT JOIN rt3 ON NULL RIGHT JOIN v6 ON ... WHERE (rt0.c1);
Faulty Plan:
|--MATERIALIZE v6
|--SCAN rt0
|--SCAN rt3 LEFT-JOIN
|--SCAN v6
`--RIGHT-JOIN v6
`--SCAN v6
Healthy Plan (after fix or workaround):
|--MATERIALIZE v6
|--SCAN v6
|--SCAN rt0
`--SCAN rt3 LEFT-JOIN
4. Validate Data Types and Constraints
Ensure columns referenced in WHERE
clauses have explicit types and constraints:
CREATE TABLE rt0 (
c0 INTEGER,
c1 INTEGER CHECK (c1 IS TRUE OR c1 IS FALSE), -- Enforce boolean
c2 INTEGER,
c3 INTEGER,
c4 INTEGER
);
This prevents invalid values like -Inf
from being stored.
5. Report Edge Cases to SQLite Team
If encountering unfixed issues, provide a minimal reproducible example:
- Simplify tables, views, and data.
- Include
EXPLAIN QUERY PLAN
outputs. - Test on the latest SQLite version.
Final Note: The root cause was a planner optimization flaw specific to RIGHT JOIN
rewrites with views and ambiguous WHERE
clauses. By understanding SQLite’s boolean evaluation rules, join optimizations, and view materialization, developers can diagnose similar issues and apply targeted fixes.