Incorrect Query Results Due to Integer-Float Rounding in RTREE Comparisons

Issue Overview: Mismatched Integer-Float Comparisons in RTREE Queries

When working with SQLite’s RTREE virtual tables and large integer values, developers may encounter unexpected query results due to implicit type conversions between integers and floating-point numbers. This issue arises when comparing a 64-bit integer rowid stored in an RTREE_i32 table against a floating-point value derived from a BLOB representation of the same integer. The root cause involves precision loss during type conversion, leading to incorrect equality evaluations in WHERE clauses.

The problem manifests in scenarios where:

  1. An RTREE_i32 table stores a rowid (c0) as a 64-bit integer with a value exceeding 2^53 (the maximum integer precisely representable in a 64-bit float).
  2. A view converts this rowid to a BLOB (textual representation) and back to a REAL (floating-point number).
  3. Direct comparisons between the original integer (c0) and the converted REAL value yield false in the SELECT list but true in the WHERE clause due to rounding errors.

Example symptoms include:

  • A query filtering with WHERE f1 returning rows even when f1 evaluates to 0 (false) in the SELECT list.
  • Subqueries using explicit IS TRUE checks correctly filtering out rows, highlighting inconsistencies in SQLite’s boolean evaluation logic.

Possible Causes: Precision Loss and Type Affinity Conflicts

1. Integer-to-Real Conversion Rounding

SQLite converts integers to REAL (64-bit floats) during comparisons when one operand is an integer and the other is a float. For integers larger than 2^53, this conversion rounds to the nearest representable float, causing:

  • Original integer: 9223372036854775807 (2^63 – 1)
  • Converted float: 9223372036854775808.0 (2^63)

Despite being consecutive integers, their floating-point representations become identical, leading to false positives in equality checks.

2. BLOB-to-REAL Casting Ambiguity

Casting a BLOB containing a numeric string to REAL introduces a second rounding operation:

  1. c0 (integer) → BLOB (text) → REAL (float)
  2. Comparison: c0 == CAST(BLOB_text AS REAL)

The double conversion (integer→text→float) exacerbates precision loss compared to direct integer→float conversion, creating mismatches between logically equivalent values.

3. RTREE_i32 Rowid Handling

Though RTREE_i32 uses 32-bit integers for spatial dimensions, its rowid (c0) remains a 64-bit integer. Inserting values via string literals (e.g., '9223372036854775807') bypasses SQLite’s usual type affinity checks, allowing storage of integers beyond 32-bit limits. This interacts unexpectedly with subsequent float conversions.

4. WHERE Clause Alias Re-Evaluation

Using column aliases like f1 in WHERE clauses forces SQLite to re-evaluate the expression in a context where type affinity rules may differ from the SELECT list. This can trigger distinct optimization paths that incorrectly prioritize float comparisons over exact integer checks.

Troubleshooting Steps, Solutions & Fixes

Step 1: Diagnose Implicit Conversion Side Effects

Verify actual data types using typeof():

SELECT 
  typeof(c0) AS c0_type,
  typeof(c4) AS c4_type,
  typeof(CAST(c4 AS REAL)) AS c4_real_type
FROM rt1, v0;

Expect:

  • c0_type: ‘integer’
  • c4_type: ‘blob’
  • c4_real_type: ‘real’

Log exact values with hexadecimal representations:

SELECT 
  c0,
  quote(c4) AS c4_hex,
  CAST(c4 AS REAL) AS c4_real
FROM rt1, v0;

Compare c0 against c4_real to identify rounding discrepancies.

Step 2: Mitigate Floating-Point Rounding

Solution A: Avoid Implicit Float Conversions

Rewrite queries to compare integers directly without REAL casting:

-- Compare c0 with BLOB interpreted as INTEGER
SELECT (c0 == CAST(c4 AS INTEGER)) AS f1 
FROM rt1, v0 
WHERE f1;

This skips the problematic float conversion but requires the BLOB to be a valid integer string.

Solution B: Use Exact Numeric Libraries

For mathematical comparisons, employ user-defined functions (UDFs) via SQLite’s C interface to handle 64-bit integers and arbitrary-precision arithmetic:

#include <sqlite3ext.h>
SQLITE_EXTENSION_INIT1

static void exactCompare(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
) {
  int64_t a = sqlite3_value_int64(argv[0]);
  const char *b_blob = sqlite3_value_blob(argv[1]);
  int64_t b;

  // Convert BLOB to integer string
  if (sscanf(b_blob, "%lld", &b) != 1) {
    sqlite3_result_error(context, "Invalid BLOB integer", -1);
    return;
  }

  sqlite3_result_int(context, a == b);
}

// Register function during extension load
int sqlite3_extension_init(
  sqlite3 *db,
  char **pzErrMsg,
  const sqlite3_api_routines *pApi
) {
  SQLITE_EXTENSION_INIT2(pApi);
  sqlite3_create_function(db, "exact_compare", 2, 
    SQLITE_UTF8 | SQLITE_DETERMINISTIC,
    NULL, exactCompare, NULL, NULL);
  return SQLITE_OK;
}

Usage:

SELECT exact_compare(c0, c4) AS f1 
FROM rt1, v0 
WHERE f1;

Step 3: Schema Redesign for Type Consistency

Fix 1: Enforce Integer Storage in Views

Modify the view v0 to preserve integer typing:

CREATE VIEW v0(c4) AS 
SELECT c0 FROM rt1;  -- Direct integer reuse

-- Or cast via INTEGER affinity
CREATE VIEW v0(c4) AS 
SELECT CAST(c0 AS INTEGER) FROM rt1;

Fix 2: Normalize Numeric Storage

Store values as TEXT with exact precision in a separate table:

CREATE TABLE exact_numerics (
  id INTEGER PRIMARY KEY,
  value TEXT CHECK(typeof(value) = 'text')
);

INSERT INTO exact_numerics(value) 
VALUES ('9223372036854775807');

-- Compare using text representations
SELECT 
  (rt1.c0 == CAST(exact_numerics.value AS INTEGER)) AS f1
FROM rt1, exact_numerics;

Step 4: Leverage SQLite’s Fix in 3.46.0+

Upgrade to SQLite 3.46.0 or newer, which includes check-in 027e5336. This update refactors RTREE comparison logic to:

  1. Avoid converting large integers to floats during WHERE clause optimizations.
  2. Treat rowid comparisons as exact integer operations even when juxtaposed with REAL literals.

Post-upgrade validation:

EXPLAIN QUERY PLAN
SELECT (c0==CAST(c4 AS REAL)) AS f1 
FROM rt1, v0 
WHERE f1;

Inspect the output for SCAN or SEARCH operations mentioning rtree constraints, indicating proper integer handling.

Step 5: Query Plan Analysis

Use EXPLAIN to identify unintended type conversions:

EXPLAIN
SELECT (c0==CAST(c4 AS REAL)) AS f1 
FROM rt1, v0 
WHERE f1;

Look for opcodes like Cast or RealAffinity applied to c0, signaling implicit conversions. Rewrite queries or adjust indexes to eliminate these ops.

Step 6: Indexing Strategies

Create a covering index to bypass BLOB conversions:

CREATE VIEW v0(c4) AS 
SELECT c0 FROM rt1;  -- c4 now integer

SELECT (rt1.c0 == v0.c4) AS f1 
FROM rt1, v0 
WHERE f1;

Step 7: Client-Side Type Enforcement

Use strict typing in application code:

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE VIRTUAL TABLE rt1 USING rtree_i32(c0, c1, c2, +c3 INT)')

# Insert with explicit integer type
conn.execute(
  'INSERT INTO rt1(c0, c2, c3) VALUES (?, ?, ?)',
  (9223372036854775807, 1840618558, 0.35548821863495284)
)

# Compare using Python's arbitrary-precision integers
c0 = conn.execute('SELECT c0 FROM rt1').fetchone()[0]
c4 = conn.execute('SELECT CAST(c4 AS TEXT) FROM v0').fetchone()[0]

print(c0 == int(c4))  # True, avoids SQLite's float conversion

Final Recommendations

  • Prefer integers over floats for exact comparisons involving large values.
  • Validate type affinities in views and CAST operations.
  • Isolate numeric storage from virtual tables like RTREE_i32 when precision is critical.
  • Monitor query plans for unintended type conversions using EXPLAIN.
  • Upgrade SQLite to leverage upstream fixes for edge-case rounding errors.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *