RIGHT JOIN ON Clause Incorrectly Propagates WHERE Constants in SQLite 3.39.0


Unexpected Query Results Due to Constant Propagation in RIGHT JOIN ON Clauses

The core issue revolves around SQLite version 3.39.0 (and earlier unreleased versions) incorrectly propagating constants from the WHERE clause into the ON conditions of RIGHT OUTER JOIN operations. This leads to inconsistent query results when joining virtual or regular tables. The problem manifests in two distinct scenarios:

  1. Empty Result vs. Non-Empty Result Discrepancy:
    A SELECT statement with a RIGHT OUTER JOIN returns no rows when the WHERE clause is omitted but returns a valid row when a WHERE condition referencing a column from the right table is added. This contradicts standard SQL behavior, where the WHERE clause should filter results after joins are processed.

  2. Incorrect Constant Propagation in ON Clauses:
    SQLite’s query optimizer incorrectly substitutes values from the WHERE clause into the ON conditions of RIGHT JOIN operations. This substitution alters the join logic and produces invalid comparisons, especially when virtual tables (e.g., rtree_i32) are involved.

Example Scenario:
Consider a virtual table rt0 with columns c1 and c2, and two empty tables t0 and t1:

CREATE VIRTUAL TABLE rt0 USING rtree_i32(c0, c1, c2);
CREATE TABLE t0 (c0);
CREATE TABLE t1 (c0);
INSERT INTO rt0 VALUES (0, 0, 0);

The first query returns an empty result:

SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0; -- {} Empty result

The second query, which adds a WHERE rt0.c1 = 0 clause, incorrectly returns a row:

SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0 
WHERE rt0.c1 = 0; -- ||0|0|0

The WHERE clause’s constant (0) is improperly pushed into the ON clause of the RIGHT JOIN, changing the join condition from rt0.c1 > t0.c0 to 0 > t0.c0. Since t0 is empty, this comparison becomes irrelevant, and the join erroneously returns a row from rt0.


Root Causes of Constant Propagation and Join Misbehavior

1. Optimizer Over-Aggressiveness in Constant Substitution

SQLite’s query optimizer attempts to simplify expressions by substituting known constants early in the query compilation process. This is normally beneficial for performance but becomes problematic when constants from the WHERE clause are propagated into ON conditions of RIGHT JOIN operations. The substitution occurs before the join is processed, leading to invalid comparisons.

Why This Affects RIGHT JOINs Specifically:
In a RIGHT OUTER JOIN, the right table’s rows are preserved regardless of whether they match the left side. The optimizer’s substitution logic fails to account for the fact that WHERE clauses referencing the right table’s columns should not influence the ON conditions of earlier joins. This violates the SQL standard’s scoping rules for joins and filters.

2. Interaction with Virtual Tables

Virtual tables like rtree_i32 (used for spatial indexing) have custom query optimization logic. The bug is more pronounced in such tables due to their internal handling of constraints. When a virtual table is the right operand of a RIGHT JOIN, the optimizer’s constant propagation interacts with the virtual table’s internal query planner, causing invalid substitutions in the join conditions.

Case Study:
In the example above, the rt0.c1 = 0 condition in the WHERE clause is pushed into the ON clause of the RIGHT JOIN. The virtual table’s internal logic may treat this as a direct constraint, bypassing the intended join logic. This results in the ON condition rt0.c1 > t0.c0 being rewritten as 0 > t0.c0, which is a nonsensical comparison when t0 is empty.

3. Incomplete Fix for Prior Bug

This issue is a regression caused by an incomplete fix for a previous bug (forum post 8e4c352937e82929). The original fix addressed constant propagation in simpler join scenarios but failed to account for edge cases involving RIGHT JOIN, virtual tables, or multi-level joins. The revised fix in check-in fb0a23b6789da8e9 resolves this by restricting constant propagation in RIGHT JOIN contexts.


Resolving the Constant Propagation Bug in RIGHT JOINs

Step 1: Verify the SQLite Version and Apply the Fix

  1. Check Current Version:
    Run SELECT sqlite_version(); to confirm the SQLite version. If the result is 3.39.0 or earlier, the bug is present.
  2. Update to a Fixed Version:
    The fix was introduced in check-in fb0a23b6789da8e9. Obtain the latest trunk version of SQLite from the official repository or wait for an official release containing the fix.

Step 2: Rewrite Queries to Avoid Constant Propagation

If upgrading is not feasible, modify queries to prevent the optimizer from propagating WHERE constants into ON clauses:

Workaround 1: Use the Unary + Operator
Add a no-op unary + operator to the WHERE clause to prevent constant substitution:

-- Original (buggy):
SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0 
WHERE rt0.c1 = 0;

-- Fixed:
SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0 
WHERE +rt0.c1 = 0; -- Unary + blocks propagation

Workaround 2: Restructure Joins
Avoid RIGHT JOIN syntax, which is less common in SQLite. Use LEFT JOIN with reordered tables:

-- Original:
SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0 
WHERE rt0.c1 = 0;

-- Restructured with LEFT JOIN:
SELECT * FROM rt0 
LEFT JOIN (t1 JOIN t0 ON rt0.c1 > t0.c0) 
WHERE rt0.c1 = 0;

Step 3: Validate Query Logic with Test Cases

Use simplified test cases to confirm the fix:

Test Case 1 (Regular Tables):

CREATE TABLE t1(a INT);
CREATE TABLE t2(b INT);
CREATE TABLE t3(c INTEGER PRIMARY KEY, d INT);
INSERT INTO t3 VALUES(0, 0);

-- Both queries should return no rows:
SELECT * FROM t1 JOIN t2 ON d > b RIGHT JOIN t3 ON true WHERE +d = 0;
SELECT * FROM t1 JOIN t2 ON d > b RIGHT JOIN t3 ON true WHERE d = 0;

Test Case 2 (Virtual Tables):

CREATE VIRTUAL TABLE rt0 USING rtree_i32(c0, c1, c2);
INSERT INTO rt0 VALUES(0, 0, 0);

-- Should return no rows (t1 and t0 are empty):
SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0 
WHERE +rt0.c1 = 0;

Step 4: Monitor Query Performance and Plan

After applying fixes, use EXPLAIN QUERY PLAN to verify that constants are not propagated into ON clauses:

EXPLAIN QUERY PLAN
SELECT * FROM t1 
JOIN t0 ON rt0.c1 > t0.c0 
RIGHT OUTER JOIN rt0 
WHERE rt0.c1 = 0;

The output should show that the WHERE clause is applied after the join, not merged into the ON condition.


Key Takeaways

  • Constant Propagation Limitations: SQLite’s optimizer aggressively substitutes constants but does not always respect join scoping rules. This can corrupt RIGHT JOIN logic.
  • Virtual Table Considerations: Custom virtual table implementations may exacerbate optimizer bugs due to their unique constraint handling.
  • Defensive Query Design: Use the unary + operator or avoid RIGHT JOIN syntax in favor of LEFT JOIN to mitigate similar issues.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *