Recovering Data from Generated Columns in SQLite Ends Up in Lost_and_Found Table
Issue Overview: Recovery Process Fails to Reconstruct Generated Columns
When attempting to recover a SQLite database containing tables with generated columns using the .recover
command, the reconstructed data appears in a lost_and_found table rather than restoring the original table structure with computed values. This occurs because generated columns (both VIRTUAL and STORED types) present unique challenges during database recovery operations. The core issue stems from SQLite’s recovery mechanism treating generated columns as regular persistent data columns rather than understanding their computational nature.
In the demonstrated scenario, the t1 table contains:
- Two persistent columns (a, b, c)
- Two generated columns:
- d: Virtual column calculating
a*abs(b)
- e: Stored column computing
substr(c,b,b+1)
- d: Virtual column calculating
After recovery attempts, the database engine creates a lost_and_found table containing raw stored values from generated columns but fails to:
- Rebuild the original table schema with column generation logic
- Recalculate virtual column values during data insertion
- Preserve the relationship between base columns and their generated counterparts
This results in permanent data loss for virtual columns and fragmented recovery of stored columns without their computational context. The recovery process essentially converts generated columns into static data fields, stripping away their defining expressions and generation types.
Possible Causes: Schema Disconnect in Recovery Workflow
1. Metadata Decoupling During Recovery
SQLite’s .recover
command operates at the storage layer rather than the schema layer. While it successfully extracts raw database page contents, it cannot:
- Reconstruct column generation expressions
- Differentiate between stored and virtual generated columns
- Preserve constraints or computed relationships
The recovery process treats all columns as regular data columns, ignoring the GENERATED ALWAYS AS
clauses in the original table definition. This metadata-storage decoupling is particularly problematic for generated columns because their values exist in different states:
- Virtual columns: Never stored (computed on-the-fly)
- Stored columns: Persisted but dependent on other columns
2. Page-Level Data Extraction Limitations
The recovery mechanism works by scanning database pages for recognizable data patterns. For generated columns:
- Virtual columns: Leave no persistent storage footprint, making them unrecoverable through page scanning
- Stored columns: Contain persisted values but without their generation context
In the sample recovery output:
INSERT INTO "lost_and_found" VALUES(2, 2, 4, 1, NULL, 2, 'Hello SQLite3', 'ell')
The NULL value represents the missing virtual column (d), while the stored column (e) retains its last computed value (‘ell’) but loses the underlying substr()
expression.
3. Insertion Logic Shortcomings
The recovery script generates standard INSERT statements that:
- Attempt to insert raw values directly
- Cannot execute expressions during data restoration
- Lack context about column generation rules
This results in two critical failures:
- Virtual columns receive NULL values (as they require computation during insertion)
- Stored columns retain their last computed value but lose regeneration capability
Troubleshooting Steps: Manual Schema Reconstruction with Value Regeneration
1. Full Schema Extraction and Analysis
Recover the original table definition from available sources:
CREATE TABLE t1(
a INTEGER PRIMARY KEY,
b INT,
c TEXT,
d INT GENERATED ALWAYS AS (a*abs(b)) VIRTUAL,
e TEXT GENERATED ALWAYS AS (substr(c,b,b+1)) STORED
);
Preserve this schema separately before attempting any recovery operations. For databases where the original schema is unavailable:
Step 1.1: Extract schema fragments from WAL files
sqlite3 damaged.db .schema 2>/dev/null > recovered_schema.sql
Step 1.2: Search backup files for CREATE TABLE
statements
grep -i 'CREATE TABLE t1' *.bak *.sql
2. Controlled Data Migration Process
Step 2.1: Create new table with original generated columns
CREATE TABLE recovered_t1(
a INTEGER PRIMARY KEY,
b INT,
c TEXT,
d INT GENERATED ALWAYS AS (a*abs(b)) VIRTUAL,
e TEXT GENERATED ALWAYS AS (substr(c,b,b+1)) STORED
);
Step 2.2: Import base columns from lost_and_found
INSERT INTO recovered_t1(a, b, c)
SELECT c1, c2, c3 FROM lost_and_found;
Step 2.3: Regenerate stored columns
-- For SQLite 3.35.5+
UPDATE recovered_t1 SET e = substr(c,b,b+1);
3. Virtual Column Value Restoration
Since virtual columns leave no persistent data trace, recompute values using the original generation formula:
Step 3.1: Add temporary column for computation
ALTER TABLE recovered_t1 ADD COLUMN d_computed INT;
Step 3.2: Populate using generation logic
UPDATE recovered_t1 SET d_computed = a * abs(b);
Step 3.3: Verify against recovered data (if available)
SELECT d_computed, lost_and_found.c0
FROM recovered_t1
JOIN lost_and_found ON recovered_t1.a = lost_and_found.id;
4. Stored Column Integrity Validation
Ensure stored column values match their original generation logic:
Step 4.1: Compute expected values
SELECT e, substr(c,b,b+1) AS expected_e
FROM recovered_t1;
Step 4.2: Identify discrepancies
SELECT * FROM (
SELECT e, substr(c,b,b+1) AS expected_e
FROM recovered_t1
) WHERE e != expected_e;
5. Prevention Strategies for Future Recovery
Strategy 5.1: Regular schema exports
sqlite3 production.db .schema > schema_backup_$(date +%s).sql
Strategy 5.2: Custom dump format preserving generated columns
sqlite3 production.db <<EOF
.output data_dump.sql
.mode insert
SELECT * FROM t1;
EOF
Strategy 5.3: Use triggers as persistent alternatives
-- For virtual column simulation
CREATE TRIGGER t1_d_calc
AFTER INSERT ON t1
BEGIN
UPDATE t1 SET d = NEW.a * abs(NEW.b) WHERE a = NEW.a;
END;
6. Advanced Recovery Using Write-Ahead Log (WAL)
Step 6.1: Locate WAL file
ls -l /path/to/database/-wal
Step 6.2: Attempt WAL-based recovery
sqlite3 recovered.db
PRAGMA journal_mode = DELETE;
ATTACH 'damaged.db' AS damaged;
ATTACH './gencol2.db-wal' AS wal;
Step 6.3: Extract committed transactions
INSERT INTO main.recovered_t1
SELECT * FROM damaged.t1
WHERE rowid IN (
SELECT rowid FROM wal.t1
);
7. Binary Recovery Techniques
For severely damaged databases:
Step 7.1: Extract raw pages
dd if=gencol2.db of=page2.raw bs=4096 count=1 skip=1
Step 7.2: Analyze page structure
sqlite3_analyzer page2.raw
Step 7.3: Manual record reconstruction
import struct
with open('page2.raw', 'rb') as f:
header = f.read(100)
# Parse SQLite page header format
page_type, first_freeblock, cell_count = struct.unpack('>BHH', header[0:5])
8. SQLite Internals Considerations
Understanding how SQLite stores generated columns:
Column Type | Storage Location | Recovery Potential |
---|---|---|
VIRTUAL | Not stored | Requires recompute |
STORED | Internal B-tree | Raw value recovery |
Key implementation details:
- Virtual columns use no storage space (computed at SELECT time)
- Stored columns occupy physical storage like regular columns
- Both types store generation expressions in
sqlite_schema
table
9. Alternative Recovery Pathways
Path 9.1: Use PRAGMA integrity_check
before recovery
PRAGMA integrity_check;
Path 9.2: Leverage sqlite3_dbconfig
defensive options
sqlite3_db_config(db, SQLITE_DBCONFIG_DEFENSIVE, 1, 0);
Path 9.3: Employ FTS5 shadow tables for auxiliary data
CREATE VIRTUAL TABLE t1_fts USING fts5(a, b, c);
10. Post-Recovery Validation Framework
Check 10.1: Schema consistency verification
SELECT name, sql FROM sqlite_schema
WHERE type='table' AND name='t1';
Check 10.2: Data integrity assessment
SELECT count(*) FROM t1
WHERE d != a * abs(b)
OR e != substr(c, b, b+1);
Check 10.3: Index consistency checks
PRAGMA quick_check;
PRAGMA foreign_key_check;
This comprehensive approach addresses both immediate data recovery needs and implements safeguards against future generated column recovery failures. The techniques combine SQLite’s native capabilities with low-level data manipulation strategies, ensuring maximum recoverability while maintaining the computational integrity of generated columns.