FTS5 Virtual Table Data Missing in SQLite Dump: Causes and Workarounds

Understanding FTS5 Virtual Table Storage Architecture

SQLite’s Full-Text Search Version 5 (FTS5) implements virtual tables that appear as normal tables to SQL queries but store their data in underlying shadow tables. When creating an FTS5 virtual table named t1, SQLite automatically generates several auxiliary tables:

  • t1_data (stores segment data)
  • t1_idx (term-to-segment mapping)
  • t1_content (original document storage)
  • t1_docsize (document size metadata)
  • t1_config (FTS5 configuration parameters)
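
The naming scheme above can be checked from Python. This is a minimal sketch rather than an official API: the helper name shadow_tables is hypothetical, and it matches purely by name, so an unrelated table such as t1_notes would be listed too. Contentless or external-content FTS5 tables may also lack some of the tables listed above.

```python
import sqlite3

def shadow_tables(conn, vtab):
    """List ordinary tables whose names look like <vtab>_<suffix>."""
    # '_' and '%' are LIKE wildcards, so escape them to match literally.
    escaped = (vtab.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE ? ESCAPE '\\' "
        "ORDER BY name",
        (escaped + "\\_%",),
    )
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
try:
    conn.execute("CREATE VIRTUAL TABLE t1 USING fts5(x)")
except sqlite3.OperationalError:
    pass  # this SQLite build was compiled without FTS5
print(shadow_tables(conn, "t1"))
```

The query reads sqlite_master rather than sqlite_schema so that it also runs on older SQLite builds.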

A targeted dump such as .dump t1 captures only the virtual table declaration from sqlite_schema, not the shadow tables containing the actual data. This occurs because:

  1. The virtual table itself has no b-tree in the database file (its rootpage in sqlite_schema is 0); every row lives in the shadow tables
  2. Shadow tables are ordinary tables with their own names, which a name filter for t1 does not match
  3. SQLite’s schema dumping logic historically treated virtual tables as declarative objects rather than data containers

When executing .dump t1, the output contains only the virtual table creation statement. Full data preservation requires explicitly dumping all related shadow tables, which the standard command didn’t handle until specific fixes were implemented.

Root Causes of Incomplete FTS5 Data Dumping

1. Target-Specific Dump Scope Limitations

The .dump [TABLE] command treats each argument as a LIKE pattern and compares it against table names in sqlite_schema, so an argument without wildcards selects only the table with exactly that name. Since FTS5 shadow tables have names derived from but not identical to their parent virtual table (e.g., t1_content), they’re excluded from targeted dumps. This manifests as:

# Problematic command missing shadow tables
sqlite3 database.db ".dump t1" > dump.sql

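# Resulting dump contains only the declaration (recent shells write it as
# an INSERT INTO sqlite_schema row rather than as a bare statement):
CREATE VIRTUAL TABLE t1 USING fts5(x);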

2. Shadow Table Naming Convention Mismatch

FTS5’s automatic shadow table generation uses ${VIRTUAL_TABLE}_${SUFFIX} naming (e.g., t1_config). Because .dump arguments are LIKE patterns, a bare table name never expands to these derived names; only an explicit wildcard does. Note that t1% also matches unrelated tables whose names merely start with t1 (such as t10). This led to scenarios where:

-- Matches t1, t1_config, t1_content, etc.
.dump t1%
-- Matches only the virtual table declaration
.dump t1

3. Virtual Table Registration Timing Issues

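A dump that carries only the virtual table declaration cannot bring the indexed data back, and the state of the restored database depends on how the declaration was written:

  1. If the dump contains a plain CREATE VIRTUAL TABLE statement, FTS5 creates a fresh set of shadow tables as soon as the statement runs
  2. Those new shadow tables are empty, so queries against t1 return no rows
  3. If the dump instead inserts the declaration directly into sqlite_schema (under PRAGMA writable_schema=ON), no shadow tables are created at all
  4. In that case queries against t1 typically fail, because FTS5 cannot find the tables it expects

Either way, the original documents and index are lost unless the shadow tables were dumped as well.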

Comprehensive Data Preservation Strategies

1. Wildcard-Enabled Dump Commands

For SQLite 3.35.0+, use the percent wildcard (%) to capture the virtual table together with its shadow tables:

# Dump virtual table and all shadow tables
sqlite3 database.db ".dump 't1%'" > full_dump.sql

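# Verify the dump contains the t1 declaration followed by the shadow
# tables and their rows (exact statement forms vary by shell version):
CREATE VIRTUAL TABLE t1 USING fts5(x);
CREATE TABLE t1_data(...);
INSERT INTO t1_data ...;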

Implementation Notes:

  • Requires shell version with commit b0bc5ab9ceec496a
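  • Quote the pattern on the command line so it reaches .dump as a single argument
  • Matching is by name only, so the approach also works for other virtual table modules that prefix their shadow tables with the table name, and it also picks up any unrelated table whose name starts with t1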

2. Full Database Dump Methodology

When dealing with multiple FTS5 tables, dump the entire database to ensure all shadow tables are captured:

sqlite3 database.db ".dump" > full_backup.sql

Restoration Verification Steps:

  1. Create new database:
    sqlite3 restored.db < full_backup.sql
    
  2. Check shadow table existence:
    SELECT name FROM sqlite_schema WHERE name LIKE 't1\_%' ESCAPE '\';
    
  3. Validate row counts:
    SELECT (SELECT COUNT(*) FROM t1) AS virtual_count,
           (SELECT COUNT(*) FROM t1_content) AS shadow_count;
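
The verification steps above can also be scripted. The following is a sketch under stated assumptions: the function names are hypothetical, both files are readable SQLite databases, and equal row counts are treated as a sanity check rather than proof of identical content.

```python
import sqlite3

def table_counts(path):
    """Map each user table (shadow tables included) to its row count."""
    conn = sqlite3.connect(path)
    try:
        names = [n for (n,) in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' "
            "AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'")]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names stay valid SQL.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()

def count_mismatches(original_path, restored_path):
    """Return {table: (original_count, restored_count)} where they differ."""
    before = table_counts(original_path)
    after = table_counts(restored_path)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
```

An empty result means every table exists on both sides with the same number of rows; a table missing from one side shows up with None in place of its count.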
    

3. Version-Specific Upgrade Requirements

For environments using SQLite versions prior to 3.35.0 (released 2021-03-12):

  1. Check SQLite version:
    SELECT sqlite_version();
    
  2. Upgrade using official binaries (download the current sqlite-tools archive for your platform from https://sqlite.org/download.html):
    unzip sqlite-tools-*.zip
    
  3. Verify fix implementation (both commands must run in one invocation, because a :memory: database disappears when the shell exits):
    ./sqlite3 :memory: "CREATE VIRTUAL TABLE t1 USING fts5(x);" ".dump t1%" | grep "t1_"
    

4. Manual Shadow Table Extraction

When unable to upgrade, manually specify all shadow tables:

sqlite3 database.db \
  ".dump t1 t1_data t1_idx t1_content t1_docsize t1_config" \
  > manual_dump.sql

Automation Script Example:

#!/bin/bash
TABLE="t1"
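# Collect shadow table names on one line, separated by spaces, because a
# dot-command must fit on a single line. sqlite_master is used so the
# query also runs on versions that predate the sqlite_schema alias.
SHADOW_TABLES=$(sqlite3 database.db "SELECT group_concat(name, ' ') FROM sqlite_master
  WHERE type = 'table' AND name LIKE '${TABLE}\_%' ESCAPE '\'")
sqlite3 database.db ".dump ${TABLE} ${SHADOW_TABLES}" > dump.sql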

5. Alternative Backup Techniques

For large FTS5 databases, consider:

A. SQLite Backup API Integration

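import sqlite3

source = sqlite3.connect('database.db')
dest = sqlite3.connect('backup.db')

# The online backup API copies every page of the file, shadow tables
# included, without first loading the database into memory.
with dest:
    source.backup(dest)

dest.close()
source.close()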
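
For a large database, the same backup call can be run in steps so that progress is visible. This sketch relies on the pages and progress parameters of Connection.backup in Python's standard sqlite3 module; the function name and the default step size are assumptions made for illustration.

```python
import sqlite3

def backup_with_progress(source_path, dest_path, pages_per_step=1000):
    """Copy source to dest in steps; return how often progress was reported."""
    steps = 0

    def on_progress(status, remaining, total):
        # Called by sqlite3 after each step of the backup.
        nonlocal steps
        steps += 1
        print(f"copied {total - remaining} of {total} pages")

    source = sqlite3.connect(source_path)
    dest = sqlite3.connect(dest_path)
    try:
        source.backup(dest, pages=pages_per_step, progress=on_progress)
    finally:
        dest.close()
        source.close()
    return steps
```

Smaller steps release the source database more often, which may matter when other connections are writing to it during the backup.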

B. Filesystem Snapshotting

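# For WAL mode databases: checkpoint first, and copy only while no
# other process has the database open
sqlite3 database.db "PRAGMA wal_checkpoint(TRUNCATE);"
cp database.db database.backup.db
# If a -wal file still exists, copy it too (the -shm file is rebuilt
# automatically and does not need to be copied)
[ -f database.db-wal ] && cp database.db-wal database.backup.db-wal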
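
A file-level copy is only trustworthy when the write-ahead log has been folded back into the main file. The sketch below shows one way to do that from Python; the function name is hypothetical, and it assumes nothing else writes to the database while the copy runs.

```python
import shutil
import sqlite3

def checkpoint_and_copy(db_path, backup_path):
    """Fold the WAL into the main file, then copy that single file."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns (busy, wal_frames, checkpointed_frames);
        # a non-zero busy flag means the checkpoint could not finish.
        busy, _, _ = conn.execute(
            "PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
        if busy:
            raise RuntimeError("checkpoint blocked by another connection")
    finally:
        conn.close()
    shutil.copyfile(db_path, backup_path)
```

When the database cannot be taken out of use, the backup API shown in the previous subsection is the safer choice.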

C. FTS5 Content Export

-- Export FTS5 content to JSON
.mode json
.once fts5_data.json
SELECT * FROM t1;

6. Disaster Recovery Protocols

When restoring from incomplete dumps:

  1. List the virtual tables whose shadow tables need to be checked:
    SELECT name FROM sqlite_schema 
    WHERE rootpage=0 AND sql LIKE 'CREATE VIRTUAL TABLE%';
    
  2. Recreate shadow structure:
    DROP TABLE t1;
    CREATE VIRTUAL TABLE t1 USING fts5(x);
    
  3. Extract data from backup:
    ATTACH DATABASE 'backup.db' AS orig;
    INSERT INTO t1(rowid, x) SELECT rowid, x FROM orig.t1;
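
The extraction step can be wrapped in a small Python helper. This is a sketch that reuses the table name t1 and column x from the example above and assumes that t1 already exists, empty, in the target connection and that backup.db still holds an intact copy; the function name is hypothetical.

```python
import sqlite3

def copy_fts_rows(conn, backup_path):
    """Refill an empty t1 from the t1 stored in backup_path."""
    conn.execute("ATTACH DATABASE ? AS orig", (backup_path,))
    try:
        conn.execute(
            "INSERT INTO t1(rowid, x) SELECT rowid, x FROM orig.t1")
        conn.commit()  # DETACH is refused while a transaction is open
    except sqlite3.Error:
        conn.rollback()
        raise
    finally:
        conn.execute("DETACH DATABASE orig")
    return conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
```

Copying the rowid along with the text keeps document identifiers stable, so any application tables that reference them stay valid.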
    

7. Prevention Best Practices

A. Database Health Monitoring

-- Check FTS5 integrity (a special INSERT command; it raises an error
-- if the index is found to be inconsistent)
INSERT INTO t1(t1) VALUES('integrity-check');
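
FTS5 exposes its integrity check as a special INSERT command on the table, which makes it easy to call from a monitoring job. A minimal Python sketch, with a hypothetical function name, and assuming that a corruption report reaches Python as sqlite3.DatabaseError:

```python
import sqlite3

def fts5_index_ok(conn, table="t1"):
    """Return True if FTS5 reports a consistent index for `table`."""
    quoted = '"' + table.replace('"', '""') + '"'
    try:
        conn.execute(
            f"INSERT INTO {quoted}({quoted}) VALUES ('integrity-check')")
        return True
    except sqlite3.OperationalError:
        raise  # missing table, FTS5 not compiled in, and similar
    except sqlite3.DatabaseError:
        return False  # assumed to be how a corruption report surfaces
    finally:
        # The command writes nothing; end the implicit transaction.
        # Assumes the caller has no uncommitted work on this connection.
        conn.rollback()
```

Operational errors are re-raised on purpose so that a missing table is not mistaken for a corrupt index.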

B. Automated Backup Verification

#!/bin/bash
# Compare the FTS5 row count with its content shadow table.
# t1_data stores index pages rather than one row per document,
# so its row count is not expected to match.
COUNTS=$(sqlite3 backup.db "SELECT
    (SELECT COUNT(*) FROM t1),
    (SELECT COUNT(*) FROM t1_content);")

# Alert if counts mismatch (sqlite3 separates columns with '|')
echo "$COUNTS" | awk -F'|' '$1 != $2 {exit 1}'

C. Schema Version Tracking

CREATE TABLE schema_version (
  id INTEGER PRIMARY KEY,
  checksum TEXT,
  created DATETIME DEFAULT CURRENT_TIMESTAMP
);

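-- Generate schema checksum
-- (sha3() ships with the sqlite3 command-line shell and is not part of
-- the core library; other clients need an equivalent hash function)
INSERT INTO schema_version(checksum)
VALUES (hex(sha3(
  (SELECT group_concat(sql, char(10)) FROM
    (SELECT sql FROM sqlite_schema WHERE sql IS NOT NULL ORDER BY type, name))
)));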
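
Where the application talks to SQLite through a driver rather than the command-line shell, the checksum can be computed on the client side instead. The sketch below swaps in SHA-256 from Python's hashlib, which is a substitution and not the hash used above; the function name is hypothetical.

```python
import hashlib
import sqlite3

def schema_checksum(conn):
    """SHA-256 over all schema SQL, ordered so the result is repeatable."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY type, name")
    digest = hashlib.sha256()
    for (sql,) in rows:
        digest.update(sql.encode("utf-8"))
        digest.update(b"\n")  # separator between schema entries
    return digest.hexdigest()
```

Sorting by type and name makes the value independent of the order in which objects were created, so an original database and a faithfully restored one should produce the same checksum.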

Performance Considerations for Large FTS5 Databases

1. Dump Optimization Techniques

Technique                | Command Example                | Tradeoffs
-------------------------|--------------------------------|------------------------------
INSERT-Statement Output  | .mode insert TABLE             | Restorable SQL vs output size
Transaction Chunking     | BEGIN; ...; COMMIT; per batch  | Atomicity vs speed
Parallel Processing      | PRAGMA threads=4               | CPU utilization vs stability
Storage Format Selection | .mode csv                      | Size vs import speed

2. Index Maintenance Protocol

-- Optimize FTS5 index after restore
INSERT INTO t1(t1) VALUES('optimize');

3. Memory Configuration Tuning

PRAGMA cache_size = -100000;  -- 100MB cache
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 268435456; -- 256MB mmap

Cross-Version Compatibility Matrix

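SQLite Version | FTS5 Dump Fix | Wildcard Support | Shadow Table Auto-Detection
---------------|---------------|------------------|----------------------------
<3.35.0        | No            | Partial          | Manual Only
3.35.0-3.36.0  | Yes           | With Quoting     | Pattern-Based
>3.37.0        | Yes           | Automatic        | Schema Analysis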

Advanced Diagnostic Techniques

1. FTS5 Internal State Inspection

-- Query internal FTS5 configuration
SELECT * FROM t1_config;

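-- Examine segment structure (t1_data has only id and block columns;
-- t1_idx is keyed by segment id)
SELECT segid, COUNT(*) FROM t1_idx GROUP BY segid;

-- Analyze term distribution through an fts5vocab table
-- (vocab_t1 is an arbitrary name)
CREATE VIRTUAL TABLE vocab_t1 USING fts5vocab('t1', 'row');
SELECT term, doc, cnt FROM vocab_t1 ORDER BY cnt DESC;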

2. Explain Query Plan Verification

EXPLAIN QUERY PLAN
SELECT * FROM t1 WHERE t1 MATCH 'recovery';

3. WAL Mode Recovery Procedures

sqlite3 damaged.db <<EOF
PRAGMA journal_mode=DELETE;
PRAGMA integrity_check;
-- Rebuild the FTS5 index from the stored content
INSERT INTO t1(t1) VALUES('rebuild');
VACUUM;
EOF

Enterprise Deployment Recommendations

  1. Implement automated backup validation pipelines
  2. Use checksum verification for critical dumps
  3. Schedule regular FTS5 integrity checks
  4. Maintain version-controlled schema definitions
  5. Deploy canary databases for restore testing
  6. Monitor shadow table growth patterns
  7. Establish rollback procedures for failed migrations

This comprehensive guide provides database administrators and developers with the technical depth needed to resolve FTS5 dumping issues while maintaining data integrity across SQLite deployments. The strategies outlined address both immediate recovery needs and long-term prevention of data loss scenarios through architectural best practices and systematic verification protocols.
