SQLite 3.43 REAL Formatting Regression in Legacy MSVC Environments

REAL Number Conversion Errors in SQLite 3.43 with Pre-2010 MSVC Compilers

Issue Overview: Unsigned 64-Bit Cast Failures in Floating-Point Arithmetic

The core problem arises from SQLite 3.43’s optimized floating-point handling logic conflicting with Microsoft Visual C++ (MSVC) compilers from 2005–2010 (versions 8.0–10.0). These compilers lack proper support for converting double values to unsigned 64-bit integers (uint64_t or u64), instead truncating them to signed 64-bit integers (int64_t). This discrepancy causes critical errors in REAL-to-text conversions, as demonstrated by the pathological case where SELECT 1.1; returns 0.922337203685478 instead of the correct 1.1.

The regression stems from SQLite 3.43’s adoption of the Dekker multiplication algorithm (dekkerMul2()) and its reliance on precise 64-bit arithmetic for formatting REAL values. This algorithm depends on accurate casting of large double values (e.g., 1.1e19) to u64 during its normalization steps. Older MSVC compilers violate this assumption by using signed conversions, so any value at or above 2^63 comes back with the wrong magnitude. A set high bit (u & 0x8000000000000000 being nonzero) is not in itself a sign of failure, because a correct conversion of 1.1e19 also sets bit 63; the defect is that the remaining bits are wrong, which leads to miscomputed exponents and mantissas.
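To make the "Dekker multiplication" idea concrete, here is a minimal sketch of the textbook Dekker product built on a Veltkamp split. It is not SQLite's dekkerMul2() source, which this article does not reproduce; the type and function names (DoubleDouble, veltkamp_split, two_product) are hypothetical, and the code assumes IEEE-754 doubles evaluated without extended precision.

```c
#include <assert.h>

/* A value represented as an unevaluated sum hi + lo. */
typedef struct { double hi; double lo; } DoubleDouble;

/* Veltkamp split: a == hi + lo, with each half short enough that
** products of halves are exact in double precision. */
static DoubleDouble veltkamp_split(double a) {
  DoubleDouble r;
  double c = 134217729.0 * a;          /* 134217729 == 2^27 + 1 */
  r.hi = c - (c - a);
  r.lo = a - r.hi;
  return r;
}

/* Dekker product: x*y == hi + lo, where hi is the rounded product
** and lo is the rounding error that a plain double would discard. */
static DoubleDouble two_product(double x, double y) {
  DoubleDouble xs, ys, r;
  /* Keep headroom so neither the split nor the product overflows. */
  assert(x > -1e150 && x < 1e150 && y > -1e150 && y < 1e150);
  xs = veltkamp_split(x);
  ys = veltkamp_split(y);
  r.hi = x * y;
  r.lo = ((xs.hi * ys.hi - r.hi) + xs.hi * ys.lo + xs.lo * ys.hi)
         + xs.lo * ys.lo;
  return r;
}
```

The point of carrying the lo term is that the pair holds more precision than one double, which is why the final step needs a trustworthy conversion of each half to a 64-bit integer.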

The issue is exacerbated in environments where:

  1. Long double support is unavailable: The uselongdouble test control (sqlite3_test_control(SQLITE_TESTCTRL_USELONGDOUBLE, ...), which is a test-control verb rather than a pragma) cannot rescue the implementation, as these compilers define long double identically to double.
  2. Legacy hardware constraints: Windows CE devices and embedded systems using these compilers cannot upgrade toolchains due to vendor lock-in or certification requirements.
  3. Precision thresholds: The sqlite3FpDecode() function’s loop condition while(rr[0] < 9.22e+17) becomes numerically unstable when truncated to signed integers, causing infinite loops or premature exits.

Possible Causes: Compiler-Specific Casting and Precision Boundaries

Three primary factors contribute to this regression:

1. Signed-Integer Casting of Double Values

MSVC 2005–2010 implement (u64)d (casting double to unsigned long long) by first converting d to a signed __int64, then reinterpreting those bits as unsigned. For values at or above 2^63 (e.g., 1.1e19), the signed conversion overflows, so the resulting bit pattern no longer represents the magnitude of d. Example:

double d = 1.1e19;  
u64 u = (u64)d; // Actually: u = (u64)(__int64)d;  
// If d >= 2^63, (__int64)d overflows → u is not 11000000000000000000, the correct value  

This violates SQLite’s assumption that u64 casts preserve magnitude.
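The effect can be modelled in portable C without a legacy compiler. The sketch below is an emulation, not MSVC's actual runtime helper: it assumes that an out-of-range signed conversion yields the x86 "integer indefinite" value (INT64_MIN), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the legacy behaviour: convert through a signed 64-bit
** integer, then reinterpret the bits as unsigned. Out-of-range and
** NaN inputs are mapped to INT64_MIN, which is an assumption based
** on what x86 conversion instructions return in that case. */
static uint64_t legacy_cast_model(double d) {
  int64_t s;
  if (d != d || d >= 9223372036854775808.0 || d < -9223372036854775808.0) {
    s = INT64_MIN;                 /* bit pattern 0x8000000000000000 */
  } else {
    s = (int64_t)d;                /* in range: ordinary truncation */
  }
  return (uint64_t)s;
}

/* Standard C conversion, defined for 0 <= d < 2^64. */
static uint64_t conforming_cast(double d) {
  assert(d >= 0.0 && d < 18446744073709551616.0);
  return (uint64_t)d;
}
```

Under this model, 1.1e19 converts to 9223372036854775808 instead of 11000000000000000000. Rounded to 15 significant digits, that value starts 922337203685478, which matches the digits in the faulty SELECT 1.1 output quoted earlier.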

2. Absence of Long Double Fallbacks

SQLite 3.43 can use long double for extra precision where that type is wider than double, and introduced dekkerMul2() to obtain comparable precision from pairs of ordinary doubles where it is not. MSVC defines long double as 64-bit (same as double), so the long double path offers no benefit there. The uselongdouble flag becomes a no-op, forcing reliance on the double-based path and its flawed 64-bit casts.

3. Precision Thresholds in FpDecode Loops

The sqlite3FpDecode() function normalizes floating-point values by repeatedly multiplying by 10 until the mantissa exceeds 9.22e+17. With incorrect casts, the computed rr[0] value stalls below this threshold, causing underflow or overflow in subsequent steps. For example:

  • Correct: 1.1 → 1.1e18 after 18 multiplies → exit loop
  • Faulty: 1.1 → 0.922e18 (due to cast errors) → loop continues indefinitely
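The normalization described above can be sketched as follows. This is a simplified illustration using a single double; it omits the second (error) term that the real routine carries in rr[1], and normalize_sketch is a hypothetical name, not a SQLite function.

```c
#include <assert.h>

/* Scale a positive value up by powers of ten until it reaches the
** 9.22e+17 threshold, recording the decimal exponent adjustment.
** Returns the scaled value; *exp10 receives minus the number of
** multiplications performed. */
static double normalize_sketch(double r, int *exp10) {
  assert(r > 0.0);                 /* also rejects NaN */
  *exp10 = 0;
  while (r < 9.22e+17) {
    r *= 10.0;
    *exp10 -= 1;
  }
  return r;
}
```

For an input of 1.1 the loop runs 18 times and leaves a value near 1.1e18, which is the "Correct" case in the list above. The digits are then obtained by converting that value to a 64-bit integer, which is the step the legacy compilers get wrong.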

Troubleshooting Steps and Solutions: Patching Compiler-Specific Casts

Step 1: Apply the SQLite Legacy MSVC Workaround Branch

The SQLite team provides a dedicated branch (legacy-msvc-workaround) addressing this regression. Follow these steps:

  1. Download the Patched Source:
    Visit https://sqlite.org/src/info/legacy-msvc-workaround and download the ZIP/tarball under "Downloads."

  2. Rebuild SQLite:
    Replace the sqlite3.c/sqlite3.h files in your project with the patched versions. For embedded builds:

    rem Run from a VS2008/2010 Command Prompt  
    cl -Os -I. -DSQLITE_OMIT_LOAD_EXTENSION sqlite3.c -link -dll -out:sqlite3.dll  
    
  3. Verify the Fix:
    Test REAL formatting with and without uselongdouble:

    sqlite> .testctrl uselongdouble 0  
    sqlite> SELECT 1.1; -- Should return 1.1  
    

Step 2: Modify Floating-Point Decoding Logic

If you cannot use the pre-patched source, manually backport these changes:

A. Replace Unsigned Casts with a Threshold-and-Offset Conversion
In sqlite3FpDecode(), avoid direct (u64)d casts on values that can reach 2^63. Instead, check against 2^63, subtract it before converting, and add it back afterward:

// Before:  
u64 u = (u64)rr[0];  

// After:  
u64 u;  
if (rr[0] >= (double)0x8000000000000000) {  
  u = (u64)(rr[0] - (double)0x8000000000000000) + 0x8000000000000000;  
} else {  
  u = (u64)rr[0];  
}  
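The same technique can be packaged as a self-contained helper that relies only on signed conversions, which is the one kind these compilers handle correctly. This is a sketch under stated assumptions: the function name is hypothetical, it uses C99 <stdint.h> types rather than SQLite's u64, and it requires 0 <= d < 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Convert d to an unsigned 64-bit integer using signed casts only.
** Precondition: 0 <= d < 2^64. */
static uint64_t double_to_u64_signed_only(double d) {
  const double two63 = 9223372036854775808.0;   /* 2^63, exact */
  assert(d >= 0.0 && d < 2.0 * two63);
  if (d >= two63) {
    /* d - 2^63 is exact and lies in [0, 2^63), so the signed cast
    ** is in range; the offset is restored in integer arithmetic. */
    return (uint64_t)(int64_t)(d - two63) + ((uint64_t)1 << 63);
  }
  return (uint64_t)(int64_t)d;
}
```

Because the subtraction is exact for doubles in [2^63, 2^64), the helper returns the same result a conforming (u64)d cast would, including 11000000000000000000 for 1.1e19.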

B. Adjust Precision Thresholds
Lower the loop’s exit condition to accommodate truncated values:

// Before:  
while( rr[0] < 9.22e+17 )  

// After (empirically determined):  
while( rr[0] < 8.5e+17 )  

Step 3: Disable Long Double Optimizations

Force-disable dekkerMul2() and long double logic via preprocessor flags:

#define SQLITE_OMIT_LONG_DOUBLE 1  
#define SQLITE_OMIT_D_EK_MUL    1 // If SQLite version permits  

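Recompile with these flags to revert to pre-3.43 algorithms. Before relying on them, search your copy of sqlite3.c for both macro names: a macro that the source never tests has no effect, and the build will succeed without changing behavior.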

Step 4: Runtime Detection of Cast Integrity

Embed a runtime check during initialization:

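/* Returns 1 if the double-to-u64 cast is correct, 0 if it is faulty. */  
static int CanCastDoubleToU64(void) {  
  volatile double d = 1.1e19; // volatile: force a run-time conversion  
  u64 u = (u64)d;  
  // 1.1e19 is exactly representable, so a correct cast yields exactly  
  // 11000000000000000000 (0x98A7D9B8314C0000). Testing bit 63 alone is  
  // not enough: the correct result has that bit set as well.  
  return u == 0x98A7D9B8314C0000;  
}  

// In sqlite3_initialize(), or at application startup:  
if (!CanCastDoubleToU64()) {  
  // Fall back here, e.g. refuse to start or use a build made from the  
  // patched source. sqlite3_config() has no documented option for  
  // replacing the double formatter, so no such call is shown.  
}  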

Step 5: Downgrade to SQLite ≤3.42.0

If patching is impractical, downgrade to SQLite 3.42.0, which lacks the problematic dekkerMul2() logic. Before downgrading, confirm that your application and schema do not depend on anything introduced in 3.43.

Long-Term Considerations for Legacy Systems

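  • Compiler Shims: Wrap (u64)d casts in an inline function that avoids the unsigned conversion altogether, for example by subtracting 2^63 before a signed cast. The _mm_cvtpd_epu64 intrinsic is not an option here: it belongs to AVX-512 rather than SSE2 and is not available in these compilers.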
  • Fixed-Point Arithmetic: For devices without FPUs, pre-format REAL values as integers scaled by a power of ten.
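  • Cross-Compilation: Where the target allows it, build with a modern toolchain (for example GCC with -msoft-float) so that floating-point operations are emulated in software by a compiler whose conversions are correct.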
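As an illustration of the fixed-point suggestion above, the sketch below formats an integer scaled by a power of ten without touching floating point at all. The function name and signature are hypothetical. It assumes a C99 snprintf; legacy MSVC only provides _snprintf, which behaves differently on truncation, so that call would need adapting there.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write value / 10^scale as decimal text, e.g. (11, 1) -> "1.1".
** Returns the number of characters written (excluding the NUL),
** or -1 if the buffer is too small. */
static int format_scaled(int64_t value, int scale, char *buf, size_t n) {
  uint64_t mag, pow10 = 1;
  const char *sign = value < 0 ? "-" : "";
  int i, len;
  assert(scale >= 0 && scale <= 18);     /* 10^18 fits in 64 bits */
  for (i = 0; i < scale; i++) pow10 *= 10;
  /* Unsigned negation is well defined even for INT64_MIN. */
  mag = value < 0 ? (uint64_t)0 - (uint64_t)value : (uint64_t)value;
  if (scale == 0) {
    len = snprintf(buf, n, "%s%llu", sign, (unsigned long long)mag);
  } else {
    /* Zero-pad the fractional part to exactly 'scale' digits. */
    len = snprintf(buf, n, "%s%llu.%0*llu", sign,
                   (unsigned long long)(mag / pow10), scale,
                   (unsigned long long)(mag % pow10));
  }
  return (len < 0 || (size_t)len >= n) ? -1 : len;
}
```

Storing 1.1 as the integer 11 with a scale of 1 and formatting it this way sidesteps REAL-to-text conversion entirely, at the cost of fixing the number of decimal places in advance.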

By addressing compiler-specific casting behavior and adjusting precision thresholds, the SQLite 3.43 regression can be mitigated without requiring toolchain upgrades.
