FTS4 Virtual Table Corruption: SQLITE_CORRUPT_VTAB When Querying Columns Despite Integrity Check


FTS4 Shadow Table Corruption Leading to Column-Specific Query Failures

The core issue involves an SQLite database with an FTS4 virtual table that passes PRAGMA integrity_check but returns SQLITE_CORRUPT_VTAB when a query names specific columns or invokes built-in FTS4 auxiliary functions such as matchinfo() or snippet(). The virtual table returns all data via SELECT *, but column-level access fails with a "database disk image is malformed" error. This discrepancy arises because FTS4 virtual tables rely on hidden shadow tables to manage indexed content and metadata. The main database structure can pass integrity checks while corruption in these shadow tables, or an inconsistency between the virtual table’s logical schema and its underlying physical storage, disrupts columnar queries and auxiliary function execution. The error appears when SQLite tries to read damaged or missing data in the shadow tables during column projection or FTS4-specific operations.
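To make the discrepancy concrete, the following is a minimal diagnostic sketch using Python's standard sqlite3 module. It runs the general PRAGMA integrity_check, FTS4's own 'integrity-check' special command, and a column-specific query, then reports which of the three fail. The function name diagnose_fts4 and the table and column names passed to it are illustrative assumptions, and the identifiers are interpolated into SQL, so they must come from a trusted source.

```python
import sqlite3

def diagnose_fts4(conn, table, column):
    """Run three independent checks and report which ones fail.

    `table` and `column` are trusted identifiers (hypothetical names).
    """
    checks = {
        "pragma_integrity_check": "PRAGMA integrity_check",
        # FTS4 special command: compares the full-text index with the content.
        "fts4_integrity_check":
            f"INSERT INTO {table}({table}) VALUES('integrity-check')",
        "column_query": f"SELECT {column} FROM {table} LIMIT 1",
    }
    report = {}
    for name, sql in checks.items():
        try:
            rows = conn.execute(sql).fetchall()
            if name == "pragma_integrity_check" and rows != [("ok",)]:
                report[name] = f"failed: {rows[0][0]}"
            else:
                report[name] = "ok"
        except sqlite3.Error as exc:  # e.g. "database disk image is malformed"
            report[name] = f"failed: {exc}"
    return report
```

A report in which pragma_integrity_check is "ok" but fts4_integrity_check has failed points at the shadow tables rather than at the database file as a whole.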


Root Causes: Shadow Table Integrity and FTS4 Implementation Nuances

The SQLITE_CORRUPT_VTAB error in this context is tied to three primary factors: incomplete or corrupted shadow table records, mismatched schema definitions, and custom build anomalies.

FTS4 virtual tables depend on a set of shadow tables prefixed with the virtual table’s name (%_content, %_segdir, %_segments, %_docsize, and %_stat). These tables store the original column text, segment metadata, index segment blocks, per-document token counts, and table-wide statistics. Corruption in these tables, such as missing rows in %_content, invalid segment pointers in %_segdir, or malformed %_stat entries, can leave the virtual table partially functional. For example, SELECT * may work because it reads directly from the %_content table, while column-specific queries and auxiliary functions can fail when they force SQLite to cross-reference the other shadow tables. If a shadow table entry that such a query needs is missing or contains invalid binary data, the query fails with SQLITE_CORRUPT_VTAB.
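As a quick way to see which shadow tables actually exist for a given virtual table, the sketch below queries sqlite_master for ordinary tables whose names start with the virtual table's name followed by an underscore. The helper name list_shadow_tables is an assumption for illustration; the query also matches any unrelated user table that happens to share the prefix.

```python
import sqlite3

def list_shadow_tables(conn, table):
    """Return names of tables called '<table>_*', sorted alphabetically."""
    # Escape underscores so LIKE treats them literally, not as wildcards.
    pattern = table.replace("_", r"\_") + r"\_%"
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE ? ESCAPE '\\' ORDER BY name",
        (pattern,),
    ).fetchall()
    return [name for (name,) in rows]
```

For a default FTS4 table named docs, the result should include docs_content, docs_docsize, docs_segdir, docs_segments, and docs_stat. A missing entry is a strong hint about where to look next.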

A second cause is schema mismatches. If the virtual table’s declared schema (e.g., column count, data types) does not align with the shadow tables’ stored data, SQLite’s query planner may misinterpret the physical storage layout. This can occur if the virtual table was altered without rebuilding its shadow tables or if manual edits were made to the shadow tables.
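One way to look for such a mismatch is to compare the columns the virtual table declares with the columns physically present in its %_content table. The sketch below assumes the default FTS4 layout, in which the content table has a docid column followed by one column per declared column, named c0<name>, c1<name>, and so on. Tables created with options such as content= or languageid= will not match this layout. The function name compare_columns is hypothetical.

```python
import sqlite3

def compare_columns(conn, table):
    """Compare declared FTS4 columns with the %_content table's columns.

    Assumes the default FTS4 layout: docid, then c<N><column name>.
    """
    declared = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    stored = [
        row[1] for row in conn.execute(f"PRAGMA table_info({table}_content)")
    ]
    expected = ["docid"] + [f"c{i}{name}" for i, name in enumerate(declared)]
    return {
        "declared": declared,
        "stored": stored,
        "layout_matches": stored == expected,
    }
```

If layout_matches is False on a table that uses no special options, the shadow table was probably edited or recreated independently of the virtual table definition.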

The third factor is the use of a custom SQLite build. Modifications to SQLite’s core or FTS4 module (altered tokenizers, custom functions, or changes to the shadow table management logic) can introduce edge cases where the database passes PRAGMA integrity_check but runtime operations fail. Built-in FTS4 functions like matchinfo() rely on low-level APIs that assume specific shadow table structures; custom builds that alter these APIs may break compatibility. However, user-defined functions that do not interact with shadow tables (e.g., scalar utilities) will continue to work, as observed in the original issue.


Resolving Shadow Table Corruption and Restoring FTS4 Functionality

Step 1: Rebuild the FTS4 Index
The first mitigation is to rebuild the FTS4 virtual table’s index using the rebuild command. Execute:

INSERT INTO fts_table(fts_table) VALUES('rebuild');

This triggers a reindexing pass that discards the existing full-text index and regenerates it from the stored content (the %_content table, or the external content table if one is configured). It does not recreate the stored text itself. If the original text is intact, this resolves inconsistencies caused by partial writes, interrupted optimizations, or transient corruption. Verify success by re-executing a column-specific query (e.g., SELECT column1 FROM fts_table). If the error persists, proceed to deeper diagnostics.
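The rebuild can be paired with FTS4's 'integrity-check' special command so that success is verified in the same step. The following is a minimal sketch under the assumption that the table name is trusted; rebuild_and_verify is an illustrative name, not part of any API.

```python
import sqlite3

def rebuild_and_verify(conn, table):
    """Rebuild the FTS4 index, then ask FTS4 to verify it.

    Returns True if both special commands succeed, False otherwise.
    """
    try:
        with conn:  # commits on success, rolls back the open transaction on error
            conn.execute(f"INSERT INTO {table}({table}) VALUES('rebuild')")
            conn.execute(
                f"INSERT INTO {table}({table}) VALUES('integrity-check')"
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

A False result after a rebuild suggests that the stored content itself is damaged, in which case the manual audit in the next step is the better route.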

Step 2: Audit Shadow Tables Manually
Inspect the shadow tables directly to identify missing or malformed records. For a virtual table named fts_table, query:

SELECT COUNT(*) FROM fts_table_content;  -- Expect one row per indexed document
SELECT * FROM fts_table_segdir ORDER BY level, idx;  -- Check for valid (start_block, end_block) pairs
SELECT id, length(value) FROM fts_table_stat;  -- 'value' is a BLOB of packed integers; check that it is present and non-empty

Compare the results against a known-good FTS4 table. If fts_table_content is empty, the rebuild command has nothing to reindex, and if fts_table_segdir contains negative block indices, the index segments are damaged. In either case, export whatever source data remains, drop the virtual table, and recreate it.
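Part of this audit can be automated. The sketch below counts rows in the main shadow tables and checks one simple invariant: in a default FTS4 table, %_docsize normally holds one row per document, so its row count should equal that of %_content. The function name audit_shadow_tables is hypothetical, and the invariant does not apply to tables created with matchinfo=fts3 or with external or contentless content.

```python
import sqlite3

def audit_shadow_tables(conn, table):
    """Count rows in FTS4 shadow tables and flag a content/docsize mismatch."""
    counts = {}
    for suffix in ("content", "docsize", "segdir", "segments"):
        (n,) = conn.execute(
            f"SELECT COUNT(*) FROM {table}_{suffix}"
        ).fetchone()
        counts[suffix] = n
    return {
        "counts": counts,
        # Assumption: default FTS4 options, one docsize row per document.
        "content_docsize_match": counts["content"] == counts["docsize"],
    }
```

A mismatch does not say which side is wrong, but it narrows the problem to rows that were added to or removed from one shadow table without the other.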

Step 3: Migrate to FTS5
FTS5’s improved storage format and consistency checks can reduce susceptibility to shadow table corruption. Create a new FTS5 table, copy the rows across from the FTS4 table, drop the FTS4 virtual table, and rename the new table into its place:

CREATE VIRTUAL TABLE fts_table_new USING fts5(column1, column2);
INSERT INTO fts_table_new SELECT * FROM fts_table;
DROP TABLE fts_table;
ALTER TABLE fts_table_new RENAME TO fts_table;

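FTS5’s stricter transaction handling and atomic writes mitigate many issues caused by interrupted operations. Note that the auxiliary functions differ: FTS5 has no matchinfo() or offsets(), its snippet() takes different arguments, and it adds highlight() and bm25(), so queries that call these functions must be updated after the migration.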
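The same migration can be scripted. The sketch below follows the SQL above but also carries each FTS4 docid over as the FTS5 rowid, so that external references to row ids stay valid. It assumes the linked SQLite library includes FTS5, that SELECT * still works on the FTS4 table, and that the column list passed in matches the FTS4 declaration; migrate_fts4_to_fts5 is an illustrative name.

```python
import sqlite3

def migrate_fts4_to_fts5(conn, table, columns):
    """Copy an FTS4 table into a new FTS5 table and swap it into place.

    `columns` must list the FTS4 table's declared columns in order.
    """
    new = f"{table}_new"
    cols = ", ".join(columns)
    with conn:  # commits on success, rolls back the open transaction on error
        conn.execute(f"CREATE VIRTUAL TABLE {new} USING fts5({cols})")
        # Keep each FTS4 docid as the FTS5 rowid.
        conn.execute(
            f"INSERT INTO {new}(rowid, {cols}) SELECT docid, * FROM {table}"
        )
        conn.execute(f"DROP TABLE {table}")
        conn.execute(f"ALTER TABLE {new} RENAME TO {table}")
```

For example, migrate_fts4_to_fts5(conn, "fts_table", ["column1", "column2"]) mirrors the statements shown in this step.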

Step 4: Validate the Custom SQLite Build
If rebuilding the index and migrating to FTS5 fail, audit the custom SQLite build for FTS4-specific modifications. Revert to an unmodified SQLite build and test the database. If the error disappears, the custom build likely introduced instability in FTS4’s shadow table management. Key areas to review include:

  • Tokenizer implementations that mishandle edge cases (e.g., non-ASCII text).
  • Overrides of xCreate, xConnect, or xBestIndex in the FTS4 virtual table module.
  • Memory allocation changes (e.g., custom sqlite3_malloc hooks) that disrupt FTS4’s storage layer.
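Before auditing source-level changes, it helps to confirm which SQLite library the application is actually linked against and whether the FTS modules load at all. The sketch below does this from Python's sqlite3 module by reading the library version and compile options, then probing each module in a throwaway in-memory database. The function name describe_sqlite_build and the probe table names are assumptions for illustration.

```python
import sqlite3

def describe_sqlite_build():
    """Report the linked SQLite version, compile options, and FTS modules."""
    conn = sqlite3.connect(":memory:")
    try:
        options = [row[0] for row in conn.execute("PRAGMA compile_options")]
        modules = {}
        for module in ("fts4", "fts5"):
            try:
                # Creating a table fails with "no such module" if unavailable.
                conn.execute(
                    f"CREATE VIRTUAL TABLE probe_{module} USING {module}(x)"
                )
                modules[module] = True
            except sqlite3.OperationalError:
                modules[module] = False
        return {
            "version": sqlite3.sqlite_version,
            "compile_options": options,
            "modules": modules,
        }
    finally:
        conn.close()
```

Running this under both the custom build and a stock build gives a baseline for comparison before testing the affected database against each.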

Step 5: Salvage Data via .dump and Reimport
For severe corruption, use SQLite’s .dump command to export the database schema and data:

sqlite3 corrupted.db .dump > backup.sql

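Edit backup.sql to remove the FTS4 virtual table definition and the index shadow tables (%_segdir, %_segments, %_docsize, %_stat). Keep the rows from fts_table_content under a different table name, since they hold the original text and the original name would collide with the shadow table of the recreated FTS4 table. Recreate the database: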

sqlite3 new.db < backup.sql

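Recreate the FTS4 table and repopulate it from the salvaged data. This bypasses the corrupted index shadow tables entirely. If .dump itself stops on a corruption error, the sqlite3 shell’s .recover command, available in recent versions, attempts to extract as much data as it can.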
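When the %_content table is still readable, the same salvage can be done programmatically without editing a dump by hand. The sketch below reads the stored text straight from the old database's content table and inserts it into a freshly created FTS4 table in a new database, which builds a new index as a side effect. It assumes the default FTS4 content layout (docid followed by one column per declared column); salvage_fts4 and its parameters are hypothetical names.

```python
import sqlite3

def salvage_fts4(old_conn, new_conn, table, columns):
    """Rebuild an FTS4 table in new_conn from old_conn's %_content rows.

    Returns the number of rows in the new table.
    """
    cols = ", ".join(columns)
    new_conn.execute(f"CREATE VIRTUAL TABLE {table} USING fts4({cols})")
    # One placeholder for docid plus one per declared column.
    placeholders = ", ".join("?" for _ in range(len(columns) + 1))
    rows = old_conn.execute(f"SELECT * FROM {table}_content ORDER BY docid")
    with new_conn:  # commit once all rows are inserted
        new_conn.executemany(
            f"INSERT INTO {table}(docid, {cols}) VALUES ({placeholders})",
            rows,
        )
    (count,) = new_conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return count
```

Comparing the returned count with the row count of the old content table is a quick check that nothing was dropped along the way.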

Final Measure: Enable Write-Ahead Logging (WAL)
To prevent future corruption, enable WAL mode:

PRAGMA journal_mode=WAL;

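WAL mode appends changes to a separate log file and checkpoints them into the main database later, which can reduce exposure to partial writes during crashes, one possible cause of shadow table inconsistencies. Periodic VACUUM operations can additionally defragment the database file.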
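The pragma returns the journal mode that is actually in effect, so it is worth checking the result rather than assuming the switch succeeded (in-memory databases, for example, cannot use WAL). A minimal sketch, with enable_wal as an illustrative helper name:

```python
import sqlite3

def enable_wal(path):
    """Switch the database at `path` to WAL mode; return the mode in effect."""
    conn = sqlite3.connect(path)
    try:
        (mode,) = conn.execute("PRAGMA journal_mode=WAL").fetchone()
        return mode
    finally:
        conn.close()
```

The setting is stored in the database file, so it only needs to be applied once per database rather than on every connection.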

By systematically addressing shadow table integrity, validating the SQLite build, and migrating to FTS5, the SQLITE_CORRUPT_VTAB error can be resolved while hardening the database against recurrence.
