FTS5 Virtual Table Data Missing in SQLite Dump: Causes and Workarounds
Understanding FTS5 Virtual Table Storage Architecture
SQLite’s Full-Text Search Version 5 (FTS5) implements virtual tables that appear as normal tables to SQL queries but store their data in underlying shadow tables. When creating an FTS5 virtual table named t1, SQLite automatically generates several auxiliary tables:
- t1_data (stores segment data)
- t1_idx (term-to-segment mapping)
- t1_content (original document storage)
- t1_docsize (document size metadata)
- t1_config (FTS5 configuration parameters)
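The shadow tables can be listed directly from the schema. The following is a minimal Python sketch, assuming the interpreter's sqlite3 module is linked against an SQLite build with FTS5 enabled; the in-memory database and the table name t1 are for illustration only.

```python
import sqlite3

# Assumption: the underlying SQLite library was compiled with FTS5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")

# sqlite_master is available on every SQLite version; sqlite_schema is a newer alias.
rows = conn.execute(
    "SELECT name, rootpage FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()

for name, rootpage in rows:
    # The virtual table itself is listed with rootpage 0: it owns no b-tree.
    print(name, rootpage)

shadow_names = [name for name, _ in rows if name != "t1"]
```

The virtual table shows up with rootpage 0, while each shadow table has a real root page, which is where the indexed data lives.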
In older shell versions, a targeted .dump t1 command captures only the virtual table declaration from sqlite_schema, not the shadow tables containing the actual data. This occurs because:
- A virtual table has no b-tree of its own in the database file (its sqlite_schema entry has rootpage 0); its rows live in the shadow tables
- Shadow tables are ordinary tables whose names merely start with the parent's name, so a filter on the parent's name alone does not pick them up
- SQLite’s schema dumping logic historically treated virtual tables as declarative objects rather than data containers
When executing .dump t1 on such a version, the output contains only the virtual table creation statement. Full data preservation requires dumping all related shadow tables as well, which older shell versions did not do for a targeted dump.
Root Causes of Incomplete FTS5 Data Dumping
1. Target-Specific Dump Scope Limitations
The .dump [TABLE] command filters output by matching its argument against names in sqlite_schema as a LIKE pattern, so a bare name without wildcards matches only that one name. Since FTS5 shadow tables have names derived from but not identical to their parent virtual table (e.g., t1_content), they’re excluded from targeted dumps. This manifests as:
# Problematic command missing shadow tables
sqlite3 database.db ".dump t1" > dump.sql
# Resulting dump contains only:
CREATE VIRTUAL TABLE t1 USING fts5(x);
2. Shadow Table Naming Convention Mismatch
FTS5’s automatic shadow table generation uses ${VIRTUAL_TABLE}_${SUFFIX} naming (e.g., t1_config). In shell versions before 3.35.0, .dump did not expand a plain table name to the shadow tables derived from it; only an explicit wildcard pattern reached them. This led to scenarios where:
-- Works (pattern matches t1_config, t1_content, etc.)
.dump t1%
-- Fails (matches only the virtual table declaration)
.dump t1
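The difference between the two arguments can be reproduced with a plain LIKE query. This is a rough Python approximation of the name filtering, not the shell's actual C implementation; the helper name matching_tables is hypothetical, and the sketch assumes an FTS5-enabled sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")

def matching_tables(conn, pattern):
    """Return table names that match a LIKE pattern (hypothetical helper)."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE ? ORDER BY name",
        (pattern,),
    )
    return [row[0] for row in cur]

exact = matching_tables(conn, "t1")    # bare name: only the virtual table
wild = matching_tables(conn, "t1%")    # wildcard: virtual table plus shadow tables

print(exact)
print(wild)
```

A pattern without wildcards behaves like an equality test, which is why the bare name never reaches t1_data or its siblings.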
3. Virtual Table Registration Timing Issues
When restoring from dumps containing only the virtual table declaration (without shadow tables), the original index and documents cannot be recovered from the dump itself. Running CREATE VIRTUAL TABLE builds a fresh set of shadow tables, but they start out empty:
- Dump contains only the CREATE VIRTUAL TABLE statement
- Restored database has none of the original shadow table rows
- Creating the virtual table generates new, empty shadow tables
- Newly created shadow tables lack original data
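The effect of a declaration-only restore can be simulated in a few lines. This is a sketch under the assumption of an FTS5-enabled sqlite3 module; the two in-memory connections stand in for the original and the restored database.

```python
import sqlite3

# Original database with two indexed documents.
src = sqlite3.connect(":memory:")
src.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
src.executemany("INSERT INTO t1(x) VALUES (?)", [("alpha beta",), ("beta gamma",)])
src.commit()

# Simulated restore of a dump that carried only the declaration.
restored = sqlite3.connect(":memory:")
restored.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")

src_rows = src.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
restored_rows = restored.execute("SELECT COUNT(*) FROM t1").fetchone()[0]

# The shadow tables exist in the restored database, but hold no documents.
restored_shadow = restored.execute(
    "SELECT COUNT(*) FROM sqlite_master "
    "WHERE type = 'table' AND name LIKE 't1!_%' ESCAPE '!'"
).fetchone()[0]

print(src_rows, restored_rows, restored_shadow)
```

The restored database looks structurally complete, which is what makes this failure easy to miss without a row-count check.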
Comprehensive Data Preservation Strategies
1. Wildcard-Enabled Dump Commands
For SQLite 3.35.0+, use the percent wildcard (%) to capture the virtual table and all related shadow tables:
# Dump virtual table and all shadow tables
sqlite3 database.db ".dump 't1%'" > full_dump.sql
# Verify dump contains:
CREATE VIRTUAL TABLE t1 USING fts5(x);
CREATE TABLE t1_data(...);
INSERT INTO t1_data ...;
Implementation Notes:
- Requires shell version with commit b0bc5ab9ceec496a
- Wildcard must be quoted to prevent shell expansion
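- The pattern t1% also matches unrelated tables whose names begin with t1 (for example t10), so review the dump for extras
- Works for FTS5 and for other virtual table modules that name their shadow tables with the parent table's name as a prefix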
2. Full Database Dump Methodology
When dealing with multiple FTS5 tables, dump the entire database to ensure all shadow tables are captured:
sqlite3 database.db ".dump" > full_backup.sql
Restoration Verification Steps:
- Create new database:
sqlite3 restored.db < full_backup.sql
- Check shadow table existence (the underscore is escaped because _ is a single-character wildcard in LIKE):
SELECT name FROM sqlite_schema WHERE name LIKE 't1\_%' ESCAPE '\';
- Validate row counts:
SELECT (SELECT COUNT(*) FROM t1) AS virtual_count, (SELECT COUNT(*) FROM t1_content) AS shadow_count;
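The verification steps can also be scripted. The sketch below is one possible shape, assuming an FTS5-enabled sqlite3 module; verify_fts5_restore and its probe_term parameter are hypothetical names, and the in-memory database stands in for restored.db. The extra MATCH query is there because a row count alone only exercises the content table, not the index.

```python
import sqlite3

def verify_fts5_restore(conn, table="t1", probe_term=None):
    """Compare row counts and optionally run a MATCH probe (hypothetical helper)."""
    virtual_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    shadow_count = conn.execute(
        f'SELECT COUNT(*) FROM "{table}_content"'
    ).fetchone()[0]
    hits = None
    if probe_term is not None:
        # A MATCH query reads the full-text index stored in <table>_data.
        hits = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{table}" MATCH ?',
            (probe_term,),
        ).fetchone()[0]
    return virtual_count, shadow_count, hits

# Stand-in for restored.db
restored = sqlite3.connect(":memory:")
restored.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
restored.executemany(
    "INSERT INTO t1(x) VALUES (?)", [("one",), ("two",), ("three",)]
)
restored.commit()

result = verify_fts5_restore(restored, probe_term="two")
print(result)
```

Choosing a probe term that is known to occur in the original data gives a quick signal that the index, and not only the stored documents, survived the restore.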
3. Version-Specific Upgrade Requirements
For environments using SQLite versions prior to 3.35.0 (released 2021-03-12):
- Check SQLite version:
SELECT sqlite_version();
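- Upgrade using official binaries (confirm the exact file name on the SQLite download page, since the year directory and platform suffix change between releases):
wget https://sqlite.org/2025/sqlite-tools-linux-x86-3450000.zip
unzip sqlite-tools-*.zip
- Verify fix implementation (both commands must run against the same in-memory database, so they are passed to a single sqlite3 invocation):
./sqlite3 :memory: "CREATE VIRTUAL TABLE t1 USING fts5(x);" ".dump t1%" | grep "t1_data"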
4. Manual Shadow Table Extraction
When unable to upgrade, manually specify all shadow tables:
sqlite3 database.db \
".dump t1 t1_data t1_idx t1_content t1_docsize t1_config" \
> manual_dump.sql
Automation Script Example:
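#!/bin/bash
TABLE="t1"
# Collect shadow table names on one line. The underscore is escaped because "_"
# is a LIKE wildcard; sqlite_master is used because it exists on old versions too.
SHADOW_TABLES=$(sqlite3 database.db "SELECT group_concat(name, ' ') FROM sqlite_master
WHERE type = 'table' AND name LIKE '${TABLE}\_%' ESCAPE '\'")
sqlite3 database.db ".dump ${TABLE} ${SHADOW_TABLES}" > dump.sql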
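The same lookup can be done from Python when a shell script is not convenient. This is a sketch, assuming an FTS5-enabled sqlite3 module; shadow_tables is a hypothetical helper, and the extra table t10 exists only to show why the underscore needs escaping.

```python
import sqlite3

def shadow_tables(conn, table):
    """List tables named <table>_<suffix>, matching "_" literally (hypothetical helper)."""
    # Escape LIKE metacharacters that may occur in the table name itself.
    escaped = table.replace("!", "!!").replace("%", "!%").replace("_", "!_")
    cur = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE ? ESCAPE '!' ORDER BY name",
        (escaped + "!_%",),
    )
    return [row[0] for row in cur]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
conn.execute("CREATE TABLE t10(y)")  # unrelated table an unescaped 't1_%' would match

names = shadow_tables(conn, "t1")

# Argument string for the shell's .dump command, virtual table first.
dump_command = ".dump " + " ".join(["t1"] + names)
print(dump_command)
```

The prefix match still picks up any unrelated table that happens to be named t1_something, so the resulting list is worth a quick look before it is used.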
5. Alternative Backup Techniques
For large FTS5 databases, consider:
A. SQLite Backup API Integration
import sqlite3
# Copy the whole database, shadow tables included, straight into the backup file
source = sqlite3.connect('database.db')
dest = sqlite3.connect('backup.db')
with dest:
    source.backup(dest)
dest.close()
source.close()
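For large databases the copy can be done in batches with a progress callback. The sketch below uses the pages and progress keyword arguments of Connection.backup from the standard library, with in-memory databases standing in for database.db and backup.db; it assumes an FTS5-enabled sqlite3 module, and the batch size of 16 pages is arbitrary.

```python
import sqlite3

def progress(status, remaining, total):
    # Called after each batch; remaining and total are page counts.
    print(f"copied {total - remaining} of {total} pages")

source = sqlite3.connect(":memory:")  # stand-in for database.db
source.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
source.executemany(
    "INSERT INTO t1(x) VALUES (?)", [(f"doc {i}",) for i in range(500)]
)
source.commit()

dest = sqlite3.connect(":memory:")    # stand-in for backup.db
source.backup(dest, pages=16, progress=progress)

# Both the stored documents and the full-text index arrive in the copy.
copied = dest.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
hits = dest.execute("SELECT COUNT(*) FROM t1 WHERE t1 MATCH 'doc'").fetchone()[0]
print(copied, hits)
```

Because the backup API copies database pages rather than SQL text, the shadow tables come along without any name matching.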
B. Filesystem Snapshotting
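# For WAL mode databases: copy only while no process is writing to the database,
# otherwise the three files can be captured in an inconsistent state.
# The -wal and -shm files may be absent after a clean shutdown.
cp database.db database.backup.db
cp database.db-wal database.backup.db-wal
cp database.db-shm database.backup.db-shm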
C. FTS5 Content Export
-- Export FTS5 content to JSON
.mode json
.once fts5_data.json
SELECT * FROM t1;
6. Disaster Recovery Protocols
When restoring from incomplete dumps:
- Identify the virtual tables whose shadow data may be missing:
SELECT name FROM sqlite_schema WHERE rootpage=0 AND sql LIKE 'CREATE VIRTUAL TABLE%';
- Recreate shadow structure:
DROP TABLE t1; CREATE VIRTUAL TABLE t1 USING fts5(x);
- Extract data from backup:
ATTACH DATABASE 'backup.db' AS orig; INSERT INTO t1(rowid, x) SELECT rowid, x FROM orig.t1;
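The recovery steps above can be sketched end to end in Python. The file names, the temporary directory, and the sample rows are illustrative assumptions; the sketch builds its own stand-in backup.db so that it runs on its own, and it assumes an FTS5-enabled sqlite3 module.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
backup_path = os.path.join(workdir, "backup.db")

# Stand-in for an intact backup that still holds the FTS5 data.
backup = sqlite3.connect(backup_path)
backup.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
backup.executemany(
    "INSERT INTO t1(x) VALUES (?)", [("first doc",), ("second doc",)]
)
backup.commit()
backup.close()

# Damaged target: the declaration was restored, the rows were not.
target = sqlite3.connect(os.path.join(workdir, "restored.db"))
target.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")

# Re-insert through the virtual table so FTS5 rebuilds its own index.
target.execute("ATTACH DATABASE ? AS orig", (backup_path,))
target.execute("INSERT INTO t1(rowid, x) SELECT rowid, x FROM orig.t1")
target.commit()
target.execute("DETACH DATABASE orig")

recovered = target.execute("SELECT rowid, x FROM t1 ORDER BY rowid").fetchall()
print(recovered)
```

Copying through the virtual table rather than into the shadow tables directly keeps the index consistent with the documents, at the cost of re-tokenizing every row.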
7. Prevention Best Practices
A. Database Health Monitoring
-- Check FTS5 integrity (the command raises an error if the index is inconsistent with the content)
INSERT INTO t1(t1) VALUES('integrity-check');
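A monitoring job can wrap the FTS5 integrity-check special command and turn an error into a boolean. This is a sketch, assuming an FTS5-enabled sqlite3 module; fts5_integrity_ok is a hypothetical helper name, and the in-memory table is only a stand-in for a real database.

```python
import sqlite3

def fts5_integrity_ok(conn, table="t1"):
    """Run the FTS5 'integrity-check' command; False if SQLite reports an error."""
    try:
        with conn:  # commits on success, rolls back if the check raises
            conn.execute(
                f'INSERT INTO "{table}"("{table}") VALUES (?)',
                ("integrity-check",),
            )
        return True
    except sqlite3.DatabaseError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
conn.executemany("INSERT INTO t1(x) VALUES (?)", [("alpha",), ("beta",)])
conn.commit()

healthy = fts5_integrity_ok(conn)
print(healthy)
```

The helper also returns False when the table does not exist, so a caller that needs to tell the two cases apart would have to inspect the exception instead.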
B. Automated Backup Verification
#!/bin/bash
# Compare document counts across the virtual table and its per-document shadow tables.
# t1_data is left out: it stores index segments, so its row count is unrelated.
COUNTS=$(sqlite3 backup.db "SELECT
(SELECT COUNT(*) FROM t1),
(SELECT COUNT(*) FROM t1_content),
(SELECT COUNT(*) FROM t1_docsize);")
# Alert if counts mismatch (sqlite3 separates columns with '|' by default)
echo "$COUNTS" | awk -F'|' '$1 != $2 || $2 != $3 {exit 1}'
C. Schema Version Tracking
CREATE TABLE schema_version (
id INTEGER PRIMARY KEY,
checksum TEXT,
created DATETIME DEFAULT CURRENT_TIMESTAMP
);
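-- Generate schema checksum
-- sha3() is an extension function bundled with the sqlite3 command-line shell,
-- not part of the core library; other hosts must register a hash function themselves
INSERT INTO schema_version(checksum)
VALUES (hex(sha3(
(SELECT group_concat(sql) FROM sqlite_schema)
)));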
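When the application talks to SQLite from Python, the checksum can be computed on the client side, which avoids depending on a hash function being registered in SQL. This is a sketch; schema_checksum is a hypothetical helper, SHA-256 is swapped in as the hash, and the docs table is only sample data.

```python
import hashlib
import sqlite3

def schema_checksum(conn):
    """SHA-256 over all schema SQL, ordered for a stable result (hypothetical helper)."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY type, name"
    ).fetchall()
    joined = "\n".join(row[0] for row in rows)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs(id INTEGER PRIMARY KEY, body TEXT)")
before = schema_checksum(conn)

# Any schema change, such as a new index, yields a different checksum.
conn.execute("CREATE INDEX docs_body ON docs(body)")
after = schema_checksum(conn)

print(before != after)
```

Ordering the rows explicitly matters here, since the order in which the schema table returns rows is not something a checksum should depend on.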
Performance Considerations for Large FTS5 Databases
1. Dump Optimization Techniques
Technique | Command Example | Tradeoffs |
---|---|---|
INSERT-Statement Output | .mode insert TABLE | Portable SQL vs larger output |
Transaction Chunking | BEGIN; ...; COMMIT; around each batch of rows | Atomicity vs speed |
Parallel Processing | PRAGMA threads=4 (caps auxiliary worker threads) | CPU utilization vs stability |
Storage Format Selection | .mode csv | Size vs import speed |
2. Index Maintenance Protocol
-- Optimize FTS5 index after restore
INSERT INTO t1(t1) VALUES('optimize');
3. Memory Configuration Tuning
PRAGMA cache_size = -100000; -- 100MB cache
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 268435456; -- 256MB mmap
Cross-Version Compatibility Matrix
SQLite Version | FTS5 Dump Fix | Wildcard Support | Shadow Table Auto-Detection |
---|---|---|---|
<3.35.0 | No | Partial | Manual Only |
3.35.0-3.36.0 | Yes | With Quoting | Pattern-Based |
3.37.0+ | Yes | Automatic | Schema Analysis |
Advanced Diagnostic Techniques
1. FTS5 Internal State Inspection
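-- Query internal FTS5 configuration
SELECT * FROM t1_config;
-- Examine segment structure (t1_idx has the columns segid, term, pgno)
SELECT segid, COUNT(*) FROM t1_idx GROUP BY segid;
-- Analyze term distribution through an fts5vocab table
CREATE VIRTUAL TABLE IF NOT EXISTS vocab_t1 USING fts5vocab(t1, 'row');
SELECT term, doc, cnt FROM vocab_t1 ORDER BY cnt DESC;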
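Term statistics are easiest to read through an fts5vocab table, which exposes one row per indexed term. The sketch below assumes an FTS5-enabled sqlite3 module; the vocabulary table name vocab_t1 and the sample documents are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
conn.executemany(
    "INSERT INTO t1(x) VALUES (?)",
    [("alpha beta",), ("alpha gamma",), ("alpha alpha",)],
)
conn.commit()

# 'row' mode: doc = number of rows containing the term,
# cnt = total number of occurrences across all rows.
conn.execute("CREATE VIRTUAL TABLE vocab_t1 USING fts5vocab(t1, 'row')")
terms = conn.execute(
    "SELECT term, doc, cnt FROM vocab_t1 ORDER BY cnt DESC, term"
).fetchall()

for term, doc, cnt in terms:
    print(term, doc, cnt)
```

An empty result from the vocabulary table on a database that should contain documents is another indicator that the index data did not survive a dump and restore.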
2. Explain Query Plan Verification
EXPLAIN QUERY PLAN
SELECT * FROM t1 WHERE t1 MATCH 'recovery';
3. WAL Mode Recovery Procedures
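sqlite3 damaged.db <<EOF
PRAGMA journal_mode=DELETE;
PRAGMA integrity_check;
-- REINDEX does not rebuild a full-text index; use the FTS5 'rebuild' command
INSERT INTO t1(t1) VALUES('rebuild');
VACUUM;
EOF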
Enterprise Deployment Recommendations
- Implement automated backup validation pipelines
- Use checksum verification for critical dumps
- Schedule regular FTS5 integrity checks
- Maintain version-controlled schema definitions
- Deploy canary databases for restore testing
- Monitor shadow table growth patterns
- Establish rollback procedures for failed migrations
This comprehensive guide provides database administrators and developers with the technical depth needed to resolve FTS5 dumping issues while maintaining data integrity across SQLite deployments. The strategies outlined address both immediate recovery needs and long-term prevention of data loss scenarios through architectural best practices and systematic verification protocols.