SQLite INTEGER vs INT Column Type Mismatch Causing Data Retrieval Errors
Understanding Data Corruption Despite Valid Integrity Checks in SQLite with C#
Issue Overview: Mismatched Data Retrieval Due to Column Type Affinity and Application-Level Type Handling
The core issue revolves around a scenario where SQLite databases pass integrity checks (PRAGMA integrity_check
) but exhibit corrupted data when queried through a C# application. Specifically, integer column values display negative numbers or incorrect magnitudes, while text columns remain unaffected. This occurs despite the same queries returning correct results in database management tools like DBeaver or the SQLite Command Line Shell (CLI).
The problem arises from a combination of three factors:
- SQLite’s flexible type system (manifest typing vs. static typing),
- Column type affinity rules, and
- How C# data access libraries interpret SQLite column types during data retrieval.
In the reported case, the user restructured a table by altering column definitions from INTEGER
to INT
, assuming equivalence. However, this triggered data retrieval errors in the C# application while leaving the database structurally valid. The root cause lies in how the C# SQLite driver maps SQLite’s type affinity to .NET data types. When columns are declared as INT
, some drivers (e.g., System.Data.SQLite) may incorrectly infer a 32-bit integer type (System.Int32
) instead of a 64-bit integer (System.Int64
), leading to integer overflow when large values are stored. This truncation produces negative numbers due to two’s complement representation in signed integers.
Possible Causes: Column Affinity Mismatches, Driver-Specific Type Mapping, and Integer Overflow
1. SQLite Type Affinity vs. Declared Column Types
SQLite uses a dynamic type system where data types are associated with values, not columns. Columns have "type affinity," which influences how values are stored and retrieved. The affinity is determined by the declared column type:
INTEGER
affinity is assigned to columns declared asINT
,INTEGER
,TINYINT
,SMALLINT
,MEDIUMINT
,BIGINT
,UNSIGNED BIG INT
,INT2
, orINT8
.- However, variations in declared types (e.g.,
INT
vs.INTEGER
) can lead to inconsistencies in how applications and drivers interpret the affinity.
While SQLite itself treats INT
and INTEGER
identically, external tools and libraries may parse the declared type literally. For example, a C# driver might map INT
to System.Int32
and INTEGER
to System.Int64
, even though SQLite stores both as 64-bit signed integers internally. This mismatch causes silent data truncation when values exceed 2^31 – 1 (the maximum for 32-bit integers).
2. Driver-Specific Type Inference and Data Retrieval Logic
The System.Data.SQLite library (and similar ORM tools like Dapper) often relies on the declared column type to choose a .NET type for data retrieval. If a column is declared as INT
, the driver might assume 32-bit integers, leading to incorrect casts. This behavior is driver-specific and not dictated by SQLite’s core engine. For example:
// Hypothetical C# code using System.Data.SQLite
using (var reader = command.ExecuteReader())
{
while (reader.Read())
{
// If the driver maps "size INT" to Int32:
int size = reader.GetInt32(1); // Fails for values > 2,147,483,647
}
}
3. Integer Overflow and Signed/Unsigned Misinterpretation
Large values stored in SQLite’s INTEGER
affinity columns are represented as 64-bit signed integers. When a driver truncates these to 32 bits, the upper 32 bits are discarded, effectively computing value % 2^32
. For values between 2^31 and 2^32 – 1, this results in negative numbers when interpreted as signed 32-bit integers. For example:
- A value of 4,294,967,296 (2^32) becomes 0 in 32-bit unsigned and -2,147,483,648 in 32-bit signed.
- A value of 3,000,000,000 becomes -1,294,967,296 in signed 32-bit integers.
Troubleshooting Steps, Solutions, and Fixes: Resolving Type Mismatches and Ensuring Data Integrity
Step 1: Validate Column Declarations and Affinity in SQLite Schema
Inspect Table Schemas: Use the SQLite CLI to dump the schema and verify column declarations:
.schema Found
Ensure that columns meant to store large integers are declared with
INTEGER
affinity (e.g.,BIGINT
,INTEGER
). Avoid ambiguous types likeINT
.Check Column Affinity: Execute a query to determine the affinity of each column:
SELECT name, type, affinity FROM pragma_table_info('Found');
Confirm that numeric columns show
INTEGER
affinity.
Step 2: Diagnose Driver-Specific Type Mapping in C#
Use Explicit Data Types in C#:
Replace implicit type conversions with explicit 64-bit integer types:// Before (risky): int size = Convert.ToInt32(reader["size"]); // After (safe): long size = Convert.ToInt64(reader["size"]);
Inspect Driver Metadata:
Retrieve the SQLite data type as reported by the driver to debug mismatches:using (var reader = command.ExecuteReader()) { var schemaTable = reader.GetSchemaTable(); foreach (DataRow row in schemaTable.Rows) { var columnName = row["ColumnName"]; var dataType = row["DataType"]; Console.WriteLine($"{columnName}: {dataType}"); } }
Ensure that columns declared as
INTEGER
map toSystem.Int64
, notSystem.Int32
.Test with Raw Queries:
Execute raw SQL queries in the C# application and compare results with the CLI:var command = connection.CreateCommand(); command.CommandText = "SELECT size, typeof(size) FROM Found"; using (var reader = command.ExecuteReader()) { while (reader.Read()) { Console.WriteLine($"Value: {reader.GetInt64(0)}, Type: {reader.GetString(1)}"); } }
If
typeof(size)
returnsinteger
but the value is corrupted, the driver is misinterpreting the type.
Step 3: Correct Schema Definitions and Application Code
Alter Column Affinities:
Redefine columns with explicitINTEGER
affinity to ensure drivers map them to 64-bit integers:CREATE TABLE Found ( id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT, size INTEGER, -- Use INTEGER instead of INT ... );
Rebuild Affected Tables:
If altering columns is not feasible (due toALTER TABLE
limitations in SQLite), recreate the table:-- Backup data CREATE TABLE Backup AS SELECT * FROM Found; -- Drop and recreate with correct affinity DROP TABLE Found; CREATE TABLE Found (...); -- Restore data INSERT INTO Found SELECT * FROM Backup;
Update Data Access Code:
Ensure all C# variables and database parameters uselong
(Int64) for integer columns:public class DriveInfo { public long Size { get; set; } // Use long instead of int public long Used { get; set; } }
Configure ORM/Dapper Mappings:
Explicitly specify type mappings if using ORM tools:SqlMapper.AddTypeHandler(typeof(long), new LongTypeHandler()); public class LongTypeHandler : SqlMapper.TypeHandler<long> { public override long Parse(object value) { return Convert.ToInt64(value); } public override void SetValue(IDbDataParameter parameter, long value) { parameter.Value = value; } }
Step 4: Prevent Future Issues with Schema and Code Standards
Adopt Consistent Naming Conventions:
Always useINTEGER
for integer columns in SQLite schemas to avoid ambiguity.Implement Integration Tests:
Create automated tests that verify data round-tripping for large values:[Test] public void StoreAndRetrieveLargeInteger() { var expectedSize = 500106784768L; // Insert and retrieve Assert.AreEqual(expectedSize, retrievedSize); }
Monitor Driver Updates:
Track updates to System.Data.SQLite or Microsoft.Data.Sqlite for fixes related to type inference.Use SQLite’s STRICT Tables (SQLite 3.37+):
Enable strict typing to enforce column types:CREATE TABLE Found (... size INTEGER ANY) STRICT;
By addressing the interplay between SQLite’s type affinity, driver-specific type mappings, and application-level data handling, developers can resolve apparent "corruption" issues that stem from integer overflow and type misinterpretation.