Converting TEXT Primary Keys to INTEGER in SQLite with Minimal Downtime
Schema Constraints and Primary Key Conversion Challenges
Issue Overview
The core challenge involves migrating existing TEXT-based primary keys to INTEGER-based primary keys in SQLite while preserving referential integrity across dependent tables. The primary obstacles stem from SQLite’s schema modification limitations, particularly the inability to directly alter column types or redefine primary keys without recreating tables. This becomes critical when tables have foreign key relationships that cascade updates or deletions. For example, a parent table Customers
with a TEXT primary key customer_id
may be referenced by child tables like Orders
through foreign keys. Converting customer_id
to an INTEGER type requires synchronizing all foreign key references across the schema, which involves generating new sequential integer keys, updating dependent tables, and ensuring that constraints like ON UPDATE CASCADE
are preserved or reimplemented. The presence of 358,000 records amplifies the risk of data corruption, index fragmentation, and transaction timeouts if the migration process is not optimized.
SQLite’s ALTER TABLE
command only supports limited operations: renaming tables or columns and adding columns. Redefining a primary key column’s data type or altering its auto-increment properties necessitates a full table rebuild. This process becomes exponentially complex when foreign key constraints exist, as SQLite does not automatically propagate schema changes to dependent tables. Additionally, the use of AUTOINCREMENT
requires careful handling because it imposes stricter requirements on integer primary keys compared to implicitly managed ROWID
values. The absence of a built-in mechanism to generate sequential integers for existing rows while maintaining relational consistency adds further complexity, especially when the original TEXT keys must be mapped to new integer values without gaps or duplicates.
Foreign Key Dependencies and Data Migration Risks
The primary risks during migration include broken foreign key relationships, data type mismatches, and loss of constraint enforcement. For instance, if the Orders
table references Customers.customer_id
as a TEXT column, converting customer_id
to INTEGER without updating Orders.customer_id
will result in type incompatibility errors. Even if the foreign key columns are also converted to INTEGER, the new integer keys must match the original TEXT keys’ logical relationships. This requires a deterministic mapping strategy, such as using the ROW_NUMBER()
window function to assign sequential integers based on the original TEXT key’s insertion order or another stable criterion. However, window functions in SQLite are only available in version 3.25.0 and later, which may necessitate version compatibility checks.
Another risk arises from transaction management. Migrating 358,000 records in a single transaction may exceed SQLite’s memory limits or cause excessive lock contention. Splitting the migration into batched transactions can mitigate this, but it introduces the risk of partial failures if the process is interrupted. Furthermore, indexes on the original TEXT primary keys must be rebuilt for the new INTEGER columns, which can temporarily degrade query performance. Triggers or views that reference the original primary key column must also be updated to reflect the new schema, adding to the operational overhead.
Step-by-Step Migration Strategy with Data Integrity Assurance
Phase 1: Schema Analysis and Dependency Mapping
Begin by generating a comprehensive list of all tables, foreign key relationships, indexes, and triggers that interact with the TEXT primary key columns. Use the following query to identify foreign key dependencies:
SELECT
m.name AS table_name,
p."from" AS foreign_key_column,
p."table" AS referenced_table,
p."to" AS referenced_column
FROM sqlite_master AS m
JOIN pragma_foreign_key_list(m.name) AS p
WHERE m.type = 'table';
This reveals which tables reference the TEXT primary key and must be updated. For each affected table, document the columns involved in foreign key constraints and their referenced tables.
Phase 2: Create a Mapping Table for Key Conversion
Generate a mapping table that associates each original TEXT primary key with a new INTEGER value. If the original keys are unique and non-null, use ROW_NUMBER()
to assign sequential integers:
CREATE TABLE customer_id_mapping AS
SELECT
customer_id AS original_key,
ROW_NUMBER() OVER (ORDER BY customer_id) AS new_key
FROM Customers;
Add an index on original_key
to accelerate lookups during foreign key updates:
CREATE INDEX idx_customer_id_mapping ON customer_id_mapping(original_key);
Phase 3: Migrate Parent Tables with Temporary Schema Rebuild
For the parent table (e.g., Customers
), create a new table with the desired INTEGER primary key and auto-increment property:
CREATE TABLE Customers_new (
customer_id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
email TEXT
);
Populate this table using the mapping table to preserve the original data while assigning new integer keys:
INSERT INTO Customers_new (customer_id, name, email)
SELECT
m.new_key,
c.name,
c.email
FROM Customers AS c
JOIN customer_id_mapping AS m ON c.customer_id = m.original_key;
Phase 4: Update Foreign Key Columns in Child Tables
For each child table (e.g., Orders
), add a temporary column to hold the new integer foreign key:
ALTER TABLE Orders ADD COLUMN customer_id_new INTEGER;
Populate this column by joining with the mapping table:
UPDATE Orders AS o
SET customer_id_new = m.new_key
FROM customer_id_mapping AS m
WHERE o.customer_id = m.original_key;
After validating the data, drop the original TEXT foreign key column and rename the new column:
ALTER TABLE Orders DROP COLUMN customer_id;
ALTER TABLE Orders RENAME COLUMN customer_id_new TO customer_id;
Recreate foreign key constraints to reference the new INTEGER primary key:
PRAGMA foreign_keys = OFF; -- Disable FK enforcement during schema changes
CREATE TABLE Orders_new (
order_id INTEGER PRIMARY KEY,
customer_id INTEGER,
order_date TEXT,
FOREIGN KEY (customer_id) REFERENCES Customers_new(customer_id)
);
INSERT INTO Orders_new SELECT * FROM Orders;
DROP TABLE Orders;
ALTER TABLE Orders_new RENAME TO Orders;
PRAGMA foreign_keys = ON;
Phase 5: Rebuild Indexes and Restore Triggers
Recreate indexes on the new INTEGER primary key columns to optimize query performance:
CREATE INDEX idx_orders_customer_id ON Orders(customer_id);
Export trigger definitions from the original schema using sqlite_master
:
SELECT sql FROM sqlite_master WHERE type = 'trigger' AND tbl_name = 'Customers';
Modify these definitions to reference the new column names and reapply them.
Phase 6: Final Validation and Cleanup
Verify data integrity by checking that all foreign key relationships are consistent:
PRAGMA foreign_key_check;
Ensure that auto-increment behavior works as expected by inserting a test row into the migrated parent table:
INSERT INTO Customers_new (name, email) VALUES ('Test Name', '[email protected]');
Confirm that the new customer_id
is generated correctly. Finally, remove the mapping table and backup the original schema:
DROP TABLE customer_id_mapping;
This approach minimizes downtime by leveraging transactional batches and ensures data consistency through systematic key mapping. For large datasets, consider splitting the UPDATE
and INSERT
operations into smaller transactions using BEGIN IMMEDIATE
and COMMIT
to avoid locking issues. Always perform the migration on a copy of the database first to validate the process before executing it in production.