Unexpected Query Results Using likely() in Indexed Columns with Type Affinity Issues
Issue Overview: Mismatched Affinity Handling with likely() in Indexed Queries
The core issue revolves around the interaction between SQLite’s likely()
function, column affinity enforcement (specifically REAL
), and index utilization in queries involving GLOB
or LIKE
comparisons. When a UNIQUE INDEX
is created on an expression involving likely(c0)
where c0
is a REAL
column, subsequent queries that reference this index while using GLOB
or LIKE
operators on the same column return unexpected results (0 rows instead of the expected 2 rows). This discrepancy arises due to a mismatch in how SQLite handles type affinity during index creation versus query execution.
The likely()
function, designed to provide optimization hints to the query planner, inadvertently suppresses the application of the RealAffinity
opcode when used in index definitions. The RealAffinity
opcode ensures that values stored in a REAL
column are treated as floating-point numbers, even if they could be represented as integers. When this opcode is omitted, values in the index are stored with a different storage class (e.g., INTEGER
or TEXT
), leading to type mismatches during comparisons. For example, the value 0
stored as REAL
in the table might be stored as INTEGER
in the index, causing GLOB
or LIKE
comparisons (which are text-based) to fail. The problem was introduced in a specific SQLite version (44578865fa) and resolved in a later check-in (791473bc6677c2ae).
Possible Causes: Type Affinity Suppression and Index Storage Class Mismatches
Missing RealAffinity Opcode in Indexed Expressions:
When an index is created on an expression involvinglikely(c0)
, SQLite’s bytecode compiler skips theRealAffinity
opcode that would normally enforce theREAL
affinity of columnc0
. This occurs becauselikely()
is treated as a no-op for type affinity purposes in certain contexts. As a result, the index stores values using their inherent storage class (e.g.,INTEGER
for0
or1
instead ofREAL
), even though the column is declared asREAL
.Implicit Type Conversion During Comparisons:
TheGLOB
andLIKE
operators perform text-based comparisons. When comparing aREAL
column (c0
) against an indexed value stored asINTEGER
orTEXT
, SQLite performs implicit type conversions. For instance,0
(stored asINTEGER
in the index) compared to0.0
(stored asREAL
in the table) usingGLOB
will fail becauseGLOB
requires exact text matches, and the string representations differ ("0"
vs."0.0"
).Unique Index Constraints and Collation Sequences:
TheUNIQUE INDEX
onlikely(c0)
enforces uniqueness based on the indexed values’ storage class. If two values are considered identical in the index (e.g.,0
asINTEGER
and0.0
asREAL
), the index will reject one of them. However, in the provided example, both0
and1
are inserted without triggering a uniqueness violation, indicating the index stores them as distinct values. This suggests the issue is not with index uniqueness but with how the indexed values are retrieved and compared during queries.
Troubleshooting Steps, Solutions & Fixes: Resolving Affinity Mismatches and Query Plan Issues
Step 1: Diagnose Affinity Mismatches in Indexed Columns
To confirm whether the index is storing values with the correct affinity, inspect the index’s internal storage using the PRAGMA index_xinfo
command. For the index idx
:
PRAGMA index_xinfo('idx');
Check the affinity
column for the indexed expression (likely(c0)
). If the affinity is not REAL
, this indicates a mismatch. Additionally, use EXPLAIN
to analyze the query plan and verify whether RealAffinity
is applied:
EXPLAIN SELECT count(*) FROM t0 INDEXED BY idx WHERE likely(c0) GLOB c0;
Look for the RealAffinity
opcode in the bytecode output. Its absence confirms the root cause.
Step 2: Force Affinity Enforcement in Index Definitions
Modify the index definition to explicitly enforce REAL
affinity using CAST
:
CREATE UNIQUE INDEX idx ON t0(CAST(likely(c0) AS REAL));
This ensures the index stores values with REAL
affinity, aligning with the column’s declared type. However, this workaround may not be necessary if the SQLite version includes the fix (791473bc6677c2ae).
Step 3: Standardize Comparison Operand Types
Rewrite queries to compare values with matching affinities. For example, explicitly cast both operands to TEXT
:
SELECT count(*) FROM t0 INDEXED BY idx WHERE CAST(likely(c0) AS TEXT) GLOB CAST(c0 AS TEXT);
This ensures the comparison is performed on text representations of both values, avoiding implicit conversions. However, this approach may impact performance due to the loss of index optimizations.
Step 4: Update SQLite to a Patched Version
The definitive solution is to upgrade to a SQLite version that includes the fix (791473bc6677c2ae). This patch corrects the handling of likely()
in indexed expressions, ensuring the RealAffinity
opcode is applied where appropriate. After updating, rebuild the index to apply the corrected affinity rules:
REINDEX idx;
Step 5: Validate Query Results Post-Fix
After applying the patch, re-run the original queries to verify correct results:
SELECT count(*) FROM t0 INDEXED BY idx WHERE likely(c0) GLOB c0; -- Expected: 2
Use EXPLAIN
to confirm the presence of RealAffinity
in the bytecode, ensuring proper affinity handling during query execution.
By systematically diagnosing affinity mismatches, enforcing type consistency, and applying upstream fixes, developers can resolve the unexpected query results caused by the interaction between likely()
, index definitions, and type affinity rules in SQLite.