SQLite INTEGER vs INT Column Type Mismatch Causing Data Retrieval Errors

Understanding Data Corruption Despite Valid Integrity Checks in SQLite with C#

Issue Overview: Mismatched Data Retrieval Due to Column Type Affinity and Application-Level Type Handling

The core issue revolves around a scenario where SQLite databases pass integrity checks (PRAGMA integrity_check) but exhibit corrupted data when queried through a C# application. Specifically, integer column values display negative numbers or incorrect magnitudes, while text columns remain unaffected. This occurs despite the same queries returning correct results in database management tools like DBeaver or the SQLite Command Line Shell (CLI).

The problem arises from a combination of three factors:

  1. SQLite’s flexible type system (manifest typing vs. static typing),
  2. Column type affinity rules, and
  3. How C# data access libraries interpret SQLite column types during data retrieval.

In the reported case, the user restructured a table by altering column definitions from INTEGER to INT, assuming equivalence. However, this triggered data retrieval errors in the C# application while leaving the database structurally valid. The root cause lies in how the C# SQLite driver maps SQLite’s type affinity to .NET data types. When columns are declared as INT, some drivers (e.g., System.Data.SQLite) may incorrectly infer a 32-bit integer type (System.Int32) instead of a 64-bit integer (System.Int64), leading to integer overflow when large values are stored. This truncation produces negative numbers due to two’s complement representation in signed integers.

Possible Causes: Column Affinity Mismatches, Driver-Specific Type Mapping, and Integer Overflow

1. SQLite Type Affinity vs. Declared Column Types

SQLite uses a dynamic type system where data types are associated with values, not columns. Columns have "type affinity," which influences how values are stored and retrieved. The affinity is determined by the declared column type:

  • INTEGER affinity is assigned to columns declared as INT, INTEGER, TINYINT, SMALLINT, MEDIUMINT, BIGINT, UNSIGNED BIG INT, INT2, or INT8.
  • However, variations in declared types (e.g., INT vs. INTEGER) can lead to inconsistencies in how applications and drivers interpret the affinity.

While SQLite itself treats INT and INTEGER identically, external tools and libraries may parse the declared type literally. For example, a C# driver might map INT to System.Int32 and INTEGER to System.Int64, even though SQLite stores both as 64-bit signed integers internally. This mismatch causes silent data truncation when values exceed 2^31 – 1 (the maximum for 32-bit integers).

2. Driver-Specific Type Inference and Data Retrieval Logic

The System.Data.SQLite library (and similar ORM tools like Dapper) often relies on the declared column type to choose a .NET type for data retrieval. If a column is declared as INT, the driver might assume 32-bit integers, leading to incorrect casts. This behavior is driver-specific and not dictated by SQLite’s core engine. For example:

// Hypothetical C# code using System.Data.SQLite
using (var reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        // If the driver maps "size INT" to Int32:
        int size = reader.GetInt32(1);  // Fails for values > 2,147,483,647
    }
}

3. Integer Overflow and Signed/Unsigned Misinterpretation

Large values stored in SQLite’s INTEGER affinity columns are represented as 64-bit signed integers. When a driver truncates these to 32 bits, the upper 32 bits are discarded, effectively computing value % 2^32. For values between 2^31 and 2^32 – 1, this results in negative numbers when interpreted as signed 32-bit integers. For example:

  • A value of 4,294,967,296 (2^32) becomes 0 in 32-bit unsigned and -2,147,483,648 in 32-bit signed.
  • A value of 3,000,000,000 becomes -1,294,967,296 in signed 32-bit integers.

Troubleshooting Steps, Solutions, and Fixes: Resolving Type Mismatches and Ensuring Data Integrity

Step 1: Validate Column Declarations and Affinity in SQLite Schema

  1. Inspect Table Schemas: Use the SQLite CLI to dump the schema and verify column declarations:

    .schema Found
    

    Ensure that columns meant to store large integers are declared with INTEGER affinity (e.g., BIGINT, INTEGER). Avoid ambiguous types like INT.

  2. Check Column Affinity: Execute a query to determine the affinity of each column:

    SELECT name, type, affinity FROM pragma_table_info('Found');
    

    Confirm that numeric columns show INTEGER affinity.

Step 2: Diagnose Driver-Specific Type Mapping in C#

  1. Use Explicit Data Types in C#:
    Replace implicit type conversions with explicit 64-bit integer types:

    // Before (risky):
    int size = Convert.ToInt32(reader["size"]);
    
    // After (safe):
    long size = Convert.ToInt64(reader["size"]);
    
  2. Inspect Driver Metadata:
    Retrieve the SQLite data type as reported by the driver to debug mismatches:

    using (var reader = command.ExecuteReader())
    {
        var schemaTable = reader.GetSchemaTable();
        foreach (DataRow row in schemaTable.Rows)
        {
            var columnName = row["ColumnName"];
            var dataType = row["DataType"];
            Console.WriteLine($"{columnName}: {dataType}");
        }
    }
    

    Ensure that columns declared as INTEGER map to System.Int64, not System.Int32.

  3. Test with Raw Queries:
    Execute raw SQL queries in the C# application and compare results with the CLI:

    var command = connection.CreateCommand();
    command.CommandText = "SELECT size, typeof(size) FROM Found";
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine($"Value: {reader.GetInt64(0)}, Type: {reader.GetString(1)}");
        }
    }
    

    If typeof(size) returns integer but the value is corrupted, the driver is misinterpreting the type.

Step 3: Correct Schema Definitions and Application Code

  1. Alter Column Affinities:
    Redefine columns with explicit INTEGER affinity to ensure drivers map them to 64-bit integers:

    CREATE TABLE Found (
        id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
        size INTEGER,  -- Use INTEGER instead of INT
        ...
    );
    
  2. Rebuild Affected Tables:
    If altering columns is not feasible (due to ALTER TABLE limitations in SQLite), recreate the table:

    -- Backup data
    CREATE TABLE Backup AS SELECT * FROM Found;
    -- Drop and recreate with correct affinity
    DROP TABLE Found;
    CREATE TABLE Found (...);
    -- Restore data
    INSERT INTO Found SELECT * FROM Backup;
    
  3. Update Data Access Code:
    Ensure all C# variables and database parameters use long (Int64) for integer columns:

    public class DriveInfo
    {
        public long Size { get; set; }  // Use long instead of int
        public long Used { get; set; }
    }
    
  4. Configure ORM/Dapper Mappings:
    Explicitly specify type mappings if using ORM tools:

    SqlMapper.AddTypeHandler(typeof(long), new LongTypeHandler());
    public class LongTypeHandler : SqlMapper.TypeHandler<long>
    {
        public override long Parse(object value)
        {
            return Convert.ToInt64(value);
        }
        public override void SetValue(IDbDataParameter parameter, long value)
        {
            parameter.Value = value;
        }
    }
    

Step 4: Prevent Future Issues with Schema and Code Standards

  1. Adopt Consistent Naming Conventions:
    Always use INTEGER for integer columns in SQLite schemas to avoid ambiguity.

  2. Implement Integration Tests:
    Create automated tests that verify data round-tripping for large values:

    [Test]
    public void StoreAndRetrieveLargeInteger()
    {
        var expectedSize = 500106784768L;
        // Insert and retrieve
        Assert.AreEqual(expectedSize, retrievedSize);
    }
    
  3. Monitor Driver Updates:
    Track updates to System.Data.SQLite or Microsoft.Data.Sqlite for fixes related to type inference.

  4. Use SQLite’s STRICT Tables (SQLite 3.37+):
    Enable strict typing to enforce column types:

    CREATE TABLE Found (... size INTEGER ANY) STRICT;
    

By addressing the interplay between SQLite’s type affinity, driver-specific type mappings, and application-level data handling, developers can resolve apparent "corruption" issues that stem from integer overflow and type misinterpretation.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *