Incorrect Query Results with Nested Joins and BETWEEN Clause Handling
Issue Overview: Inconsistent Results Between CASE-Wrapped and Direct BETWEEN in Nested Joins
The core problem arises when a query involving nested LEFT OUTER JOIN
and RIGHT OUTER JOIN
operations produces different results depending on whether a BETWEEN
clause is wrapped inside a redundant CASE
expression or used directly in the WHERE
clause. This inconsistency violates the principle of query equivalence: logically equivalent expressions should yield identical outputs.
Technical Context
Two tables are involved:
- t1 with a single text column
c6
as the primary key. - t2 with an integer column
vkey
and a text columnc12
.
Sample data:
INSERT INTO t1 VALUES ('');
INSERT INTO t2 VALUES (88, '');
Test Case 1 uses a redundant CASE
expression that always evaluates to ref_1.vkey BETWEEN NULL AND ref_0.vkey
:
SELECT 1
FROM ((t2 AS ref_0
LEFT OUTER JOIN t2 AS ref_1 ON TRUE)
RIGHT OUTER JOIN t1 AS ref_2 ON ref_1.c12 = ref_2.c6)
WHERE CASE WHEN TRUE
THEN (ref_1.vkey BETWEEN NULL AND ref_0.vkey)
ELSE (ref_1.vkey BETWEEN NULL AND ref_0.vkey)
END;
This returns 0 rows.
Test Case 2 simplifies the WHERE
clause to use BETWEEN
directly:
SELECT 1
FROM ((t2 AS ref_0
LEFT OUTER JOIN t2 AS ref_1 ON TRUE)
RIGHT OUTER JOIN t1 AS ref_2 ON ref_1.c12 = ref_2.c6)
WHERE ref_1.vkey BETWEEN NULL AND ref_0.vkey;
This returns 1 row.
The discrepancy occurs despite the logical equivalence of the WHERE
clauses. The root cause lies in SQLite’s query optimizer handling of NULL
values, join reordering, and expression evaluation in nested join contexts.
Possible Causes: Join Reordering, NULL Semantics, and Expression Flattening
Three primary factors contribute to the inconsistent results:
1. Join Reordering and Outer Join Optimization
SQLite’s query optimizer reorders joins to improve performance. However, outer joins (LEFT
or RIGHT
) impose constraints on reordering because their results depend on table order. In Test Case 1, the redundant CASE
expression may disable certain optimizations, altering the effective join order or nullability of columns.
The nested structure—(t2 LEFT JOIN t2) RIGHT JOIN t1
—creates a complex dependency chain:
- The
LEFT OUTER JOIN
betweent2 AS ref_0
andt2 AS ref_1
produces all combinations ofref_0
andref_1
rows, preserving all rows fromref_0
. - The
RIGHT OUTER JOIN
witht1 AS ref_2
preserves all rows fromt1
, matching them withref_1.c12 = ref_2.c6
.
When the CASE
expression is present, the optimizer may defer evaluating the BETWEEN
clause until after joins are resolved, whereas the direct BETWEEN
in Test Case 2 could be pushed down earlier, altering intermediate result sets.
2. NULL Handling in BETWEEN and Ternary Logic
The BETWEEN
operator in SQL evaluates to NULL
if any operand is NULL
. For example, ref_1.vkey BETWEEN NULL AND ref_0.vkey
becomes NULL
because the lower bound is NULL
. In a WHERE
clause, NULL
is treated as FALSE
, filtering out rows.
However, wrapping BETWEEN
in a CASE
expression changes the evaluation context. The CASE
expression’s result is explicitly checked for truthiness. If the BETWEEN
clause evaluates to NULL
, the CASE
returns NULL
, which the WHERE
clause interprets as FALSE
. This should behave identically to the direct BETWEEN
, but optimizer decisions may cause differences in how NULL
propagation is handled across joins.
3. Expression Flattening and Constant Folding
SQLite’s optimizer attempts to simplify expressions through constant folding and subquery flattening. The redundant CASE WHEN TRUE THEN X ELSE X END
is logically equivalent to X
, but older SQLite versions (pre-fix) might not fully flatten this structure. This could leave artifacts in the query plan, such as temporary tables or deferred evaluations, that interact poorly with outer joins.
In Test Case 1, the CASE
might prevent the optimizer from recognizing that BETWEEN
can be evaluated at a specific stage, leading to incorrect nullability assumptions for ref_0.vkey
or ref_1.vkey
.
Troubleshooting Steps, Solutions & Fixes: Query Plan Analysis and Patch Application
Step 1: Diagnose Query Plans with EXPLAIN
Use EXPLAIN
to compare the execution plans of both test cases.
Test Case 1 Query Plan:
EXPLAIN QUERY PLAN
SELECT 1 FROM ... [Test Case 1 query];
Look for operations like SCAN
, SEARCH
, or USE TEMP B-TREE
that indicate how joins are materialized. Pay attention to the order of nested loops and where the BETWEEN
clause is applied.
Test Case 2 Query Plan:
Repeat the process for the second query. Differences in join order or filter application indicate optimizer inconsistencies.
Step 2: Evaluate Intermediate Results with Subquery Isolation
Break down the nested joins into intermediate subqueries to inspect intermediate results.
Isolate the LEFT JOIN:
SELECT * FROM t2 AS ref_0 LEFT JOIN t2 AS ref_1 ON TRUE;
This produces all combinations of t2
rows. With the sample data, ref_0
and ref_1
each have one row, so the result is one row with ref_0.vkey=88
, ref_1.vkey=88
.
Apply RIGHT JOIN with t1:
SELECT * FROM
(SELECT * FROM t2 AS ref_0 LEFT JOIN t2 AS ref_1 ON TRUE) AS tmp
RIGHT JOIN t1 AS ref_2 ON ref_1.c12 = ref_2.c6;
The RIGHT JOIN
preserves the single row from t1
(with c6=''
). The join condition ref_1.c12 = ref_2.c6
evaluates to '' = ''
, which is TRUE
. Thus, the intermediate result should have one row.
Apply WHERE Clause:
The WHERE
clause ref_1.vkey BETWEEN NULL AND ref_0.vkey
evaluates to 88 BETWEEN NULL AND 88
, which is NULL
(→ FALSE
). However, Test Case 2 returns a row, suggesting the WHERE
clause is not applied as expected.
Step 3: Identify NULL Propagation in Joins
In Test Case 1, the CASE
expression might force SQLite to evaluate BETWEEN
before all joins are resolved, leading to premature filtering. For example, if ref_0.vkey
is considered NULL
during an early evaluation stage, the BETWEEN
clause would discard rows incorrectly.
Step 4: Apply the Official Patch or Upgrade SQLite
The fix (f1eae192315335d7) addresses optimizer behavior related to nested joins and expression flattening. After applying the patch or upgrading to SQLite 3.45.0+, both test cases should return identical results.
Step 5: Rewrite the Query for Compatibility
If upgrading isn’t immediate, rewrite the query to avoid relying on complex join structures or redundant CASE
expressions. For example, materialize intermediate results with CTEs:
WITH joined_data AS (
SELECT ref_0.vkey AS vkey0, ref_1.vkey AS vkey1
FROM t2 AS ref_0
LEFT JOIN t2 AS ref_1 ON TRUE
)
SELECT 1
FROM t1
LEFT JOIN joined_data ON joined_data.vkey1 = t1.c6
WHERE joined_data.vkey1 BETWEEN NULL AND joined_data.vkey0;
This forces a deterministic evaluation order, reducing optimizer ambiguity.
Final Resolution
The inconsistency stems from SQLite’s query optimizer mishandling expression evaluation in nested outer joins when redundant CASE
expressions are present. The fix ensures proper flattening of expressions and correct nullability tracking in join operations. Developers should:
- Upgrade to SQLite ≥3.45.0 with the fix.
- Avoid redundant
CASE
expressions in join-heavy queries. - Use
EXPLAIN
to verify query plans when encountering unexpected results.