Inconsistent Results with UNION ALL and Mixed Affinities in SQLite
Inconsistent Query Results Due to Affinity Conflicts in Compound SELECT Statements
Mixed Affinity Handling in UNION ALL Queries and Optimization-Dependent Behavior
The core issue arises when executing compound SELECT
statements (e.g., UNION ALL
) containing columns with conflicting type affinities while toggling query optimizer flags like SQLITE_QueryFlattener
. SQLite may return different numbers of rows depending on whether subquery flattening optimizations are enabled. This occurs because the data type affinity of columns in compound queries becomes indeterminate when constituent subqueries return different affinities, leading to inconsistent type conversion behavior during WHERE
clause evaluations.
Consider this simplified reproduction:
CREATE TABLE t0(c0 INT, c1 INT);
INSERT INTO t0 VALUES(10,10);
-- First query with optimization enabled
SELECT * FROM t0 JOIN (
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT c1 FROM t0
) WHERE 10 = c2; -- Returns 2 rows
.testctrl optimizations 0x00000001; -- Disable QueryFlattener
-- Second query after optimization disabled
SELECT * FROM t0 JOIN (
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT c1 FROM t0
) WHERE 10 = c2; -- Returns 1 row
The discrepancy stems from SQLite’s handling of type affinity in compound queries. When SQLITE_QueryFlattener
is active, the optimizer transforms the subquery into a form that alters how column affinities are resolved. With the optimization disabled, the raw compound query structure persists, causing different type coercion rules during the WHERE 10 = c2
comparison.
Determinants of Affinity Resolution in Compound Queries
Three primary factors contribute to this behavior:
Indeterminate Column Affinity in Compound SELECTs
SQLite documentation explicitly states that when a compound query (e.g.,UNION
,UNION ALL
,EXCEPT
,INTERSECT
) contains columns with differing affinities, the resulting column affinity becomes undefined. The database engine may use any constituent subquery’s affinity for type conversion decisions, and this choice can vary across SQLite versions, query executions, or even within different parts of the same query. In the example:- First subquery:
CAST(c0 AS TEXT)
createsc2
withTEXT
affinity - Second subquery:
c1
(integer column) givesc2
INTEGER
affinity
The finalc2
affinity isn’t guaranteed to adopt either consistently.
- First subquery:
Optimizer-Driven Query Structure Changes
TheSQLITE_QueryFlattener
optimization rewrites nested subqueries into joins where possible. When enabled:SELECT * FROM t0 JOIN (...subquery...) WHERE ...
Becomes equivalent to:
SELECT * FROM t0, (flattened_subquery) WHERE ...
This structural change impacts how SQLite’s type resolution engine processes column affinities. Flattened queries may resolve affinities earlier in the query pipeline compared to nested compound queries, altering implicit type conversions during
WHERE
clause evaluation.Context-Dependent Type Conversion Rules
SQLite performs affinity-based type conversions during comparisons using the following precedence:- If one operand has
INTEGER
affinity and the other hasTEXT
, numeric comparison occurs if theTEXT
value is a well-formed integer - When comparing
TEXT
toINTEGER
affinity columns,TEXT
values not convertible to integers are treated as less than any numeric value
In the example,WHERE 10 = c2
behaves differently depending on whetherc2
is consideredTEXT
orINTEGER
: TEXT
affinity: Compare string ’10’ to integer 10 → match (type conversion occurs)INTEGER
affinity: Compare integer 10 to integer 10 → match (no conversion needed)
However, whenc2
contains mixed affinities from a compound query, the comparison may inconsistently apply these rules across query executions.
- If one operand has
Mitigating Affinity Conflicts and Ensuring Result Consistency
Step 1: Standardize Affinities in Compound Queries
Modify all constituent SELECT
statements in compound queries to return columns with identical affinities:
-- Explicitly CAST all outputs to TEXT
SELECT * FROM t0 JOIN (
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT CAST(c1 AS TEXT) FROM t0 -- Now both arms have TEXT affinity
) WHERE 10 = c2;
-- Alternative: Use INTEGER affinity throughout
SELECT * FROM t0 JOIN (
SELECT c0 AS c2 FROM t0 -- Inherits INTEGER affinity from c0
UNION ALL
SELECT c1 FROM t0 -- Also INTEGER affinity
) WHERE 10 = c2;
This eliminates affinity indeterminacy by ensuring all subquery components agree on column affinity.
Step 2: Use Explicit Type Casting in Comparisons
When modifying existing queries isn’t feasible, enforce type consistency at comparison points:
SELECT * FROM t0 JOIN (
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT c1 FROM t0
) WHERE CAST(10 AS TEXT) = c2; -- Force TEXT comparison
Or:
SELECT * FROM t0 JOIN (
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT c1 FROM t0
) WHERE 10 = CAST(c2 AS INTEGER); -- Force INTEGER conversion
Explicit casts make comparison rules deterministic regardless of underlying column affinity.
Step 3: Understand Optimization Flags’ Impact on Type Handling
While not recommended for production use, temporarily modifying optimization flags helps diagnose affinity-related issues:
-- Disable all optimizations for diagnostics
.testctrl optimizations 0xFFFFFFFF
-- Run query
SELECT ...;
-- Re-enable optimizations
.testctrl optimizations 0
Key flags affecting affinity resolution:
0x00000001
(QueryFlattener): Disables subquery-to-join conversion0x00000004
(PushDown): Disables WHERE clause pushdown optimizations
Test queries with different flag combinations to identify which optimization triggers behavioral changes.
Step 4: Leverage EXPLAIN to Analyze Affinity Propagation
Use EXPLAIN
and EXPLAIN VIRTUAL MACHINE CODE
to see how optimizations affect query structure:
EXPLAIN
SELECT * FROM t0 JOIN (
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT c1 FROM t0
) WHERE 10 = c2;
Compare output before and after applying .testctrl optimizations 0x00000001
to observe:
- Changes in subquery flattening (look for
OpenEphemeral
opcodes indicating temporary tables) - Type conversion operations (
String8
,ToNumeric
,Affinity
opcodes)
This reveals where and how affinity conversions occur in different optimization scenarios.
Step 5: Implement Permanent Schema-Level Affinity Controls
For persistent data integrity, enforce affinities at the schema level:
-- Create view with explicit affinity
CREATE VIEW v0 AS
SELECT CAST(c0 AS TEXT) AS c2 FROM t0
UNION ALL
SELECT CAST(c1 AS TEXT) FROM t0;
-- Query using affinity-stable view
SELECT * FROM t0 JOIN v0 WHERE 10 = c2;
Views with standardized affinities provide consistent type handling regardless of optimization settings.
Step 6: Monitor Affinity-Related Edge Cases
Develop automated checks for affinity mismatches in compound queries:
-- Detect mixed affinities in UNION queries
SELECT
type AS left_affinity,
(SELECT type FROM pragma_table_info('subquery2') WHERE name = 'c2') AS right_affinity
FROM pragma_table_info('subquery1')
WHERE name = 'c2';
Regularly audit queries containing UNION
, JOIN
, or subqueries for potential affinity conflicts.
By systematically applying these techniques, developers can eliminate optimization-dependent result inconsistencies caused by SQLite’s flexible type affinity system. The ultimate solution lies in proactively managing affinities at design time rather than relying on optimization-specific behaviors.