Unexpected JOIN Results with LIKELIHOOD Function in SQLite 3.28-3.34
JOIN Condition Evaluation Failure in SQLite Versions 3.28-3.34 When Using LIKELIHOOD with AND Logic
JOIN Clause Truth Miscalculation with LIKELIHOOD and AND Operator
Core Problem Statement
A critical defect exists in SQLite versions 3.28.0 through 3.34.0 where JOIN conditions containing the LIKELIHOOD() function combined with logical AND operators produce incorrect results. This manifests when:
- The LIKELIHOOD() wrapper surrounds a false predicate
- A secondary condition in an AND clause also evaluates as false
- Both conditions are required to be true for row matching
The failure occurs specifically in the query optimizer’s handling of likelihood hints during JOIN evaluation, where the system incorrectly allows row matching despite both logical conditions failing. This contradicts standard SQL logic where AND-connected false conditions should prevent row inclusion in the result set.
Version-Specific Query Optimizer Flaw in Predicate Handling
The root cause resides in SQLite’s bytecode generation for JOIN operations when using the LIKELIHOOD() function to influence query planner decisions. Key technical factors include:
- Incorrect Short-Circuit Evaluation: The LIKELIHOOD() function’s intended purpose as a query planner hint about predicate probability (not a runtime value modifier) was improperly implemented in affected versions. When combined with AND logic, the optimizer would sometimes skip full evaluation of subsequent conditions.
- Type Affinity Mismanagement: The v0.v3 column’s comparison with string literal '111' versus numeric 111 creates implicit type conversion requirements. Older versions failed to properly handle these conversions when likelihood hints were present, leading to incorrect truth evaluations.
- Bytecode Optimization Defect: Check-in 2363a14ca723c034 in the SQLite source tree reveals the flaw stemmed from how the code generator handled OP_IfNot jumps when likelihood-modified predicates appeared in compound expressions. The bug caused premature termination of condition evaluation chains.
- WITHOUT ROWID Table Interactions: The use of WITHOUT ROWID tables (which use clustered indexes) combined with likelihood hints exacerbated the issue due to differences in index search patterns compared to standard rowid tables.
Comprehensive Resolution Strategy for Defective JOIN Evaluations
Step 1: Immediate Version Verification
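Run the following query to confirm the SQLite version:
SELECT sqlite_version();
- Affected: 3.28.0 ≤ version ≤ 3.34.0
- Patched: ≥ 3.36.0
- Not classified here: releases after 3.34.0 and before 3.36.0; verify these with the regression test in Step 9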
Step 2: Query Pattern Remediation (Forced Version Retention)
When unable to upgrade immediately, rewrite affected JOIN clauses using:
-- Original defective pattern
SELECT * FROM v4 JOIN v0
ON LIKELIHOOD(unstable_condition, 0.5) AND critical_condition;
-- Remediated pattern using nested CASE
SELECT * FROM v4 JOIN v0
ON CASE WHEN critical_condition THEN LIKELIHOOD(unstable_condition, 0.5) ELSE FALSE END;
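Before relying on the rewrite, it can help to run both forms side by side and confirm they agree. The following is a minimal sketch using Python's standard sqlite3 module. The v0 and v4 schemas, the sample row, and the choice of `v0.v3 = v0.v1` and `v0.v3 = '111'` as stand-ins for unstable_condition and critical_condition are assumptions made for illustration, since the real tables are not shown here.

```python
import sqlite3

# Hypothetical schema and data; the real tables are not shown in this write-up.
SETUP = """
CREATE TABLE v0 (v1 PRIMARY KEY, v2, v3) WITHOUT ROWID;
CREATE TABLE v4 (x);
INSERT INTO v0 VALUES ('111', NULL, '333');
INSERT INTO v4 VALUES (1);
"""

# Original pattern: two false conditions joined with AND.
ORIGINAL = """
SELECT * FROM v4 JOIN v0
ON LIKELIHOOD(v0.v3 = v0.v1, 0.5) AND v0.v3 = '111'
"""

# Remediated pattern: the critical condition gates the hinted one via CASE.
REMEDIATED = """
SELECT * FROM v4 JOIN v0
ON CASE WHEN v0.v3 = '111' THEN LIKELIHOOD(v0.v3 = v0.v1, 0.5) ELSE 0 END
"""


def compare_join_forms():
    """Run both JOIN forms on a fresh in-memory database and return their rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        original = conn.execute(ORIGINAL).fetchall()
        remediated = conn.execute(REMEDIATED).fetchall()
    finally:
        conn.close()
    return original, remediated


if __name__ == "__main__":
    original, remediated = compare_join_forms()
    print("SQLite", sqlite3.sqlite_version)
    print("original rows:  ", original)
    print("remediated rows:", remediated)
```

With this sample data neither condition holds, so any row in the original result while the remediated result is empty would point to the defect.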
Step 3: Type Affinity Enforcement
Explicitly cast values to prevent hidden type conversion errors:
SELECT * FROM v4 JOIN v0
ON LIKELIHOOD(CAST(v0.v3 AS TEXT) = CAST(v0.v1 AS TEXT), 0.5)
AND CAST(v0.v3 AS INTEGER) = 111; -- Explicit type handling
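To see why the explicit casts matter, the sketch below compares an integer stored in an untyped column against the text literal '111', with and without CAST. It uses Python's standard sqlite3 module; the single-column table `t` is a hypothetical stand-in for v0, chosen so that no column affinity is applied to the comparison.

```python
import sqlite3


def affinity_demo():
    """Return (typeof, raw comparison, TEXT-cast comparison, INTEGER-cast comparison)."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table: a column declared without a type triggers no
        # conversion when compared against a literal.
        conn.execute("CREATE TABLE t (v3)")
        conn.execute("INSERT INTO t VALUES (111)")  # stored as INTEGER
        return conn.execute(
            "SELECT typeof(v3), "
            "v3 = '111', "                 # INTEGER vs TEXT, compared as is
            "CAST(v3 AS TEXT) = '111', "   # both sides TEXT
            "CAST(v3 AS INTEGER) = 111 "   # both sides INTEGER
            "FROM t"
        ).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(affinity_demo())  # ('integer', 0, 1, 1)
```

The uncast comparison is false because an INTEGER value never equals a TEXT value when no affinity conversion applies, while both cast forms compare like with like.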
Step 4: EXPLAIN Analysis for Bytecode Inspection
Generate and compare execution plans between versions:
EXPLAIN
SELECT * FROM v4 JOIN v0
ON LIKELIHOOD( v0.v3 = v0.v1, 0.5 ) AND v0.v3 = '111';
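Differences to look for in patched versions (opcode names appear in EXPLAIN output without the OP_ prefix, and the exact listing varies by release and schema):
- Additional Affinity operations (OP_Affinity) for type enforcement
- Modified IfNot jump targets ensuring both conditions evaluate
- Changes in the seek and termination opcodes used to scan the WITHOUT ROWID index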
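Comparing plans by eye is error-prone, so one option is to dump the opcode listing to text on each version and diff the two files. The sketch below uses Python's standard sqlite3 module and assumes a hypothetical v0/v4 schema, since the real one is not shown. It relies only on the documented EXPLAIN column order (addr, opcode, p1, p2, p3, ...).

```python
import sqlite3

# Hypothetical schema; substitute the real table definitions.
SETUP = """
CREATE TABLE v0 (v1 PRIMARY KEY, v2, v3) WITHOUT ROWID;
CREATE TABLE v4 (x);
"""

QUERY = """
EXPLAIN
SELECT * FROM v4 JOIN v0
ON LIKELIHOOD(v0.v3 = v0.v1, 0.5) AND v0.v3 = '111'
"""


def opcode_listing():
    """Return (addr, opcode, p1, p2, p3) for each bytecode instruction."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return [tuple(row[:5]) for row in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite", sqlite3.sqlite_version)
    for addr, opcode, p1, p2, p3 in opcode_listing():
        print(f"{addr:4d} {opcode:<14} {p1:4d} {p2:4d} {p3:4d}")
```

Redirecting the output to a file under each SQLite build and running a text diff highlights changed jump targets (the p2 operand of IfNot) without manual inspection.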
Step 5: Progressive Condition Testing
Isolate condition components through iterative testing:
-- Test LIKELIHOOD component alone
SELECT 1 WHERE LIKELIHOOD('333' = '111', 0.5); -- Returns 0 rows
-- Test secondary condition alone
SELECT 1 WHERE '333' = '111'; -- Returns 0 rows
-- Test combined AND logic
SELECT 1 WHERE LIKELIHOOD(0, 0.5) AND 0; -- Patched: 0 rows, Defective: 1 row
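The three probes can be run in one pass and reported together, which makes it easier to spot which component misbehaves on a given build. This is a minimal sketch with Python's standard sqlite3 module; the probe labels are illustrative names, not part of SQLite.

```python
import sqlite3

# The three isolation probes from this step, with illustrative labels.
PROBES = [
    ("likelihood_alone", "SELECT 1 WHERE LIKELIHOOD('333' = '111', 0.5)"),
    ("secondary_alone", "SELECT 1 WHERE '333' = '111'"),
    ("combined_and", "SELECT 1 WHERE LIKELIHOOD(0, 0.5) AND 0"),
]


def run_probes():
    """Return a dict mapping each probe label to the number of rows returned."""
    conn = sqlite3.connect(":memory:")
    try:
        return {label: len(conn.execute(sql).fetchall()) for label, sql in PROBES}
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite", sqlite3.sqlite_version)
    for label, rows in run_probes().items():
        print(f"{label:<18} {rows} row(s)")
```

Every probe should report zero rows on a correct build; a nonzero count for only the combined probe matches the defect described above.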
Step 6: Schema Modification Workarounds
For critical systems requiring retention of affected versions, implement schema changes:
-- Convert WITHOUT ROWID tables to standard tables
CREATE TABLE v0_new (v1 PRIMARY KEY, v2, v3);
INSERT INTO v0_new SELECT * FROM v0;
DROP TABLE v0;
ALTER TABLE v0_new RENAME TO v0;
-- Add computed column mirroring the JOIN condition
-- (generated columns require SQLite 3.31.0 or later)
ALTER TABLE v0 ADD COLUMN join_flag GENERATED ALWAYS AS (
LIKELIHOOD(v3 = v1, 0.5) AND (v3 = '111')
);
SELECT * FROM v4 JOIN v0 ON join_flag;
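Because the table conversion drops and renames tables, it is safer to run it inside a single transaction so that a failure leaves the original v0 intact. The sketch below shows one way to do that from Python's standard sqlite3 module. The column list (v1, v2, v3) follows the CREATE TABLE statement above; the helper name `convert_to_rowid_table` is hypothetical.

```python
import sqlite3


def convert_to_rowid_table(conn):
    """Rebuild v0 as an ordinary rowid table inside one explicit transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute("CREATE TABLE v0_new (v1 PRIMARY KEY, v2, v3)")
        conn.execute("INSERT INTO v0_new SELECT v1, v2, v3 FROM v0")
        conn.execute("DROP TABLE v0")
        conn.execute("ALTER TABLE v0_new RENAME TO v0")
        conn.execute("COMMIT")
    except Exception:
        # Undo every step so the original table survives a failed conversion.
        conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    # isolation_level=None stops the sqlite3 module from opening implicit
    # transactions, so the explicit BEGIN/COMMIT above controls the scope.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE v0 (v1 PRIMARY KEY, v2, v3) WITHOUT ROWID")
    conn.execute("INSERT INTO v0 VALUES ('111', NULL, '333')")
    convert_to_rowid_table(conn)
    print(conn.execute("SELECT rowid, v1, v2, v3 FROM v0").fetchall())
    conn.close()
```

Indexes, triggers, and views that reference v0 are not recreated by this sketch and would need to be handled separately.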
Step 7: Query Planner Hint Alternatives
Replace LIKELIHOOD() with alternative optimizer directives:
-- Use INDEXED BY instead of probability hints
CREATE INDEX v0_v3_v1 ON v0(v3, v1);
SELECT * FROM v4 JOIN v0 INDEXED BY v0_v3_v1
ON v3 = v1 AND v3 = '111';
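To confirm that the planner actually picks up the named index, the plan can be inspected with EXPLAIN QUERY PLAN. The sketch below does this from Python's standard sqlite3 module, again assuming a hypothetical v0/v4 schema. It reads only the last column of each plan row, which holds the human-readable detail text; the wording of that text differs between releases.

```python
import sqlite3


def plan_details():
    """Return the human-readable plan steps for the INDEXED BY query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema; substitute the real table definitions.
        conn.executescript("""
            CREATE TABLE v0 (v1 PRIMARY KEY, v2, v3) WITHOUT ROWID;
            CREATE TABLE v4 (x);
            CREATE INDEX v0_v3_v1 ON v0(v3, v1);
        """)
        rows = conn.execute("""
            EXPLAIN QUERY PLAN
            SELECT * FROM v4 JOIN v0 INDEXED BY v0_v3_v1
            ON v3 = v1 AND v3 = '111'
        """).fetchall()
        # The detail text is the last column of every plan row.
        return [row[-1] for row in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    for step in plan_details():
        print(step)
```

If the named index cannot be used, SQLite refuses to prepare the statement, so an error here is itself a useful signal that the hint does not fit the query.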
Step 8: Defensive Programming Practices
Implement runtime checks for version-specific defects:
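-- At application startup. RAISE() is only valid inside triggers, so return a
-- status and let the application abort on 'UNSUPPORTED'. The minor version is
-- compared numerically because string comparison misorders e.g. '3.3.0'.
-- This flags every 3.28.x through 3.34.x release.
SELECT
CASE WHEN sqlite_version() LIKE '3.%'
AND CAST(substr(sqlite_version(), 3) AS INTEGER) BETWEEN 28 AND 34
THEN 'UNSUPPORTED' ELSE 'OK' END AS version_check;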
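A startup guard is often easier to express in application code, where version components can be compared as integers and a failure can raise a real exception. The following is a minimal sketch in Python using the standard sqlite3 module. The version bounds are taken from Step 1; the function names and the "unverified" category for releases between the two bounds are assumptions of this sketch.

```python
import sqlite3

AFFECTED_MIN = (3, 28, 0)  # first release listed as affected
AFFECTED_MAX = (3, 34, 0)  # last release listed as affected
PATCHED_MIN = (3, 36, 0)   # first release listed as patched


def classify_version(version_info):
    """Classify a (major, minor, patch) tuple as 'affected', 'unverified', or 'ok'."""
    if AFFECTED_MIN <= version_info <= AFFECTED_MAX:
        return "affected"
    if AFFECTED_MAX < version_info < PATCHED_MIN:
        return "unverified"  # not classified by the source; test before trusting
    return "ok"


def assert_supported_sqlite(version_info=None):
    """Raise RuntimeError on an affected SQLite library; otherwise return the status."""
    if version_info is None:
        # Version of the SQLite library actually loaded at runtime.
        version_info = sqlite3.sqlite_version_info
    status = classify_version(version_info)
    if status == "affected":
        version = ".".join(str(part) for part in version_info)
        raise RuntimeError(f"Unsupported SQLite version with JOIN defect: {version}")
    return status


if __name__ == "__main__":
    print(sqlite3.sqlite_version, "->", classify_version(sqlite3.sqlite_version_info))
```

Tuple comparison avoids the pitfalls of comparing version strings, where '3.9.0' sorts after '3.34.0'.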
Step 9: Regression Test Construction
Develop comprehensive test cases to validate JOIN condition handling:
-- Setup
CREATE TABLE test_a (id TEXT PRIMARY KEY) WITHOUT ROWID;
INSERT INTO test_a VALUES ('foo');
CREATE TABLE test_b (id TEXT PRIMARY KEY) WITHOUT ROWID;
INSERT INTO test_b VALUES ('bar');
-- Defect trigger test
SELECT COUNT(*) FROM test_a JOIN test_b
ON LIKELIHOOD(test_a.id = test_b.id, 0.1)
AND test_a.id = 'baz'; -- Should always return 0
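-- Expected results assertion (self-contained: counts matching rows directly,
-- since wrapping the COUNT(*) query above would always yield exactly one row)
SELECT CASE WHEN COUNT(*) = 0
THEN 'PASS' ELSE 'FAIL' END AS test_result
FROM test_a JOIN test_b
ON LIKELIHOOD(test_a.id = test_b.id, 0.1)
AND test_a.id = 'baz';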
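The same check can be wired into an automated test suite so that it runs against whichever SQLite library the application ships with. Below is a minimal sketch using Python's standard sqlite3 and unittest modules. It reuses the test_a and test_b tables from the setup above; the helper and class names are hypothetical.

```python
import sqlite3
import unittest


def count_false_join_matches():
    """Return how many rows the defect-trigger JOIN produces (expected: 0)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript("""
            CREATE TABLE test_a (id TEXT PRIMARY KEY) WITHOUT ROWID;
            INSERT INTO test_a VALUES ('foo');
            CREATE TABLE test_b (id TEXT PRIMARY KEY) WITHOUT ROWID;
            INSERT INTO test_b VALUES ('bar');
        """)
        (count,) = conn.execute("""
            SELECT COUNT(*) FROM test_a JOIN test_b
            ON LIKELIHOOD(test_a.id = test_b.id, 0.1)
            AND test_a.id = 'baz'
        """).fetchone()
        return count
    finally:
        conn.close()


class LikelihoodJoinRegression(unittest.TestCase):
    def test_false_and_false_matches_nothing(self):
        # Both JOIN conditions are false, so no row may be returned.
        self.assertEqual(count_false_join_matches(), 0)
```

Saved as a module, this can be run with `python -m unittest`, and it fails on any build where the JOIN returns a row despite both conditions being false.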
Final Recommendation: Upgrade to SQLite ≥3.36.0 where practical. For legacy system support, combine explicit type casting, query restructuring, and defensive schema design to mitigate the defect. Implement version detection routines and regression test suites to prevent recurrence.