Incorrect Query Results with Nested Joins and BETWEEN Clause Handling

Issue Overview: Inconsistent Results Between CASE-Wrapped and Direct BETWEEN in Nested Joins

The core problem arises when a query involving nested LEFT OUTER JOIN and RIGHT OUTER JOIN operations produces different results depending on whether a BETWEEN clause is wrapped inside a redundant CASE expression or used directly in the WHERE clause. This inconsistency violates the principle of query equivalence: logically equivalent expressions should yield identical outputs.

Technical Context

Two tables are involved:

  • t1 with a single text column c6 as the primary key.
  • t2 with an integer column vkey and a text column c12.

Sample data:

INSERT INTO t1 VALUES ('');  
INSERT INTO t2 VALUES (88, '');  

Test Case 1 uses a redundant CASE expression that always evaluates to ref_1.vkey BETWEEN NULL AND ref_0.vkey:

SELECT 1  
FROM ((t2 AS ref_0  
LEFT OUTER JOIN t2 AS ref_1 ON TRUE)  
RIGHT OUTER JOIN t1 AS ref_2 ON ref_1.c12 = ref_2.c6)  
WHERE CASE WHEN TRUE  
  THEN (ref_1.vkey BETWEEN NULL AND ref_0.vkey)  
  ELSE (ref_1.vkey BETWEEN NULL AND ref_0.vkey)  
END;  

This returns 0 rows.

Test Case 2 simplifies the WHERE clause to use BETWEEN directly:

SELECT 1  
FROM ((t2 AS ref_0  
LEFT OUTER JOIN t2 AS ref_1 ON TRUE)  
RIGHT OUTER JOIN t1 AS ref_2 ON ref_1.c12 = ref_2.c6)  
WHERE ref_1.vkey BETWEEN NULL AND ref_0.vkey;  

This returns 1 row.

The discrepancy occurs despite the logical equivalence of the WHERE clauses. The root cause lies in SQLite’s query optimizer handling of NULL values, join reordering, and expression evaluation in nested join contexts.

Possible Causes: Join Reordering, NULL Semantics, and Expression Flattening

Three primary factors contribute to the inconsistent results:

1. Join Reordering and Outer Join Optimization

SQLite’s query optimizer reorders joins to improve performance. However, outer joins (LEFT or RIGHT) impose constraints on reordering because their results depend on table order. In Test Case 1, the redundant CASE expression may disable certain optimizations, altering the effective join order or nullability of columns.

The nested structure—(t2 LEFT JOIN t2) RIGHT JOIN t1—creates a complex dependency chain:

  • The LEFT OUTER JOIN between t2 AS ref_0 and t2 AS ref_1 produces all combinations of ref_0 and ref_1 rows, preserving all rows from ref_0.
  • The RIGHT OUTER JOIN with t1 AS ref_2 preserves all rows from t1, matching them with ref_1.c12 = ref_2.c6.

When the CASE expression is present, the optimizer may defer evaluating the BETWEEN clause until after joins are resolved, whereas the direct BETWEEN in Test Case 2 could be pushed down earlier, altering intermediate result sets.

2. NULL Handling in BETWEEN and Ternary Logic

The BETWEEN operator in SQL evaluates to NULL if any operand is NULL. For example, ref_1.vkey BETWEEN NULL AND ref_0.vkey becomes NULL because the lower bound is NULL. In a WHERE clause, NULL is treated as FALSE, filtering out rows.

However, wrapping BETWEEN in a CASE expression changes the evaluation context. The CASE expression’s result is explicitly checked for truthiness. If the BETWEEN clause evaluates to NULL, the CASE returns NULL, which the WHERE clause interprets as FALSE. This should behave identically to the direct BETWEEN, but optimizer decisions may cause differences in how NULL propagation is handled across joins.

3. Expression Flattening and Constant Folding

SQLite’s optimizer attempts to simplify expressions through constant folding and subquery flattening. The redundant CASE WHEN TRUE THEN X ELSE X END is logically equivalent to X, but older SQLite versions (pre-fix) might not fully flatten this structure. This could leave artifacts in the query plan, such as temporary tables or deferred evaluations, that interact poorly with outer joins.

In Test Case 1, the CASE might prevent the optimizer from recognizing that BETWEEN can be evaluated at a specific stage, leading to incorrect nullability assumptions for ref_0.vkey or ref_1.vkey.

Troubleshooting Steps, Solutions & Fixes: Query Plan Analysis and Patch Application

Step 1: Diagnose Query Plans with EXPLAIN

Use EXPLAIN to compare the execution plans of both test cases.

Test Case 1 Query Plan:

EXPLAIN QUERY PLAN  
SELECT 1 FROM ... [Test Case 1 query];  

Look for operations like SCAN, SEARCH, or USE TEMP B-TREE that indicate how joins are materialized. Pay attention to the order of nested loops and where the BETWEEN clause is applied.

Test Case 2 Query Plan:
Repeat the process for the second query. Differences in join order or filter application indicate optimizer inconsistencies.

Step 2: Evaluate Intermediate Results with Subquery Isolation

Break down the nested joins into intermediate subqueries to inspect intermediate results.

Isolate the LEFT JOIN:

SELECT * FROM t2 AS ref_0 LEFT JOIN t2 AS ref_1 ON TRUE;  

This produces all combinations of t2 rows. With the sample data, ref_0 and ref_1 each have one row, so the result is one row with ref_0.vkey=88, ref_1.vkey=88.

Apply RIGHT JOIN with t1:

SELECT * FROM  
(SELECT * FROM t2 AS ref_0 LEFT JOIN t2 AS ref_1 ON TRUE) AS tmp  
RIGHT JOIN t1 AS ref_2 ON ref_1.c12 = ref_2.c6;  

The RIGHT JOIN preserves the single row from t1 (with c6=''). The join condition ref_1.c12 = ref_2.c6 evaluates to '' = '', which is TRUE. Thus, the intermediate result should have one row.

Apply WHERE Clause:
The WHERE clause ref_1.vkey BETWEEN NULL AND ref_0.vkey evaluates to 88 BETWEEN NULL AND 88, which is NULL (→ FALSE). However, Test Case 2 returns a row, suggesting the WHERE clause is not applied as expected.

Step 3: Identify NULL Propagation in Joins

In Test Case 1, the CASE expression might force SQLite to evaluate BETWEEN before all joins are resolved, leading to premature filtering. For example, if ref_0.vkey is considered NULL during an early evaluation stage, the BETWEEN clause would discard rows incorrectly.

Step 4: Apply the Official Patch or Upgrade SQLite

The fix (f1eae192315335d7) addresses optimizer behavior related to nested joins and expression flattening. After applying the patch or upgrading to SQLite 3.45.0+, both test cases should return identical results.

Step 5: Rewrite the Query for Compatibility

If upgrading isn’t immediate, rewrite the query to avoid relying on complex join structures or redundant CASE expressions. For example, materialize intermediate results with CTEs:

WITH joined_data AS (  
  SELECT ref_0.vkey AS vkey0, ref_1.vkey AS vkey1  
  FROM t2 AS ref_0  
  LEFT JOIN t2 AS ref_1 ON TRUE  
)  
SELECT 1  
FROM t1  
LEFT JOIN joined_data ON joined_data.vkey1 = t1.c6  
WHERE joined_data.vkey1 BETWEEN NULL AND joined_data.vkey0;  

This forces a deterministic evaluation order, reducing optimizer ambiguity.

Final Resolution

The inconsistency stems from SQLite’s query optimizer mishandling expression evaluation in nested outer joins when redundant CASE expressions are present. The fix ensures proper flattening of expressions and correct nullability tracking in join operations. Developers should:

  1. Upgrade to SQLite ≥3.45.0 with the fix.
  2. Avoid redundant CASE expressions in join-heavy queries.
  3. Use EXPLAIN to verify query plans when encountering unexpected results.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *