Recovering and Preventing SQLITE_CORRUPT_VTAB in FTS5 Contentless Tables
Virtual Table Corruption in FTS5 Contentless Tables During Trigger-Driven Updates
Issue Overview: Malformed Disk Image Due to FTS5 Trigger Interactions
The core problem arises when a trigger modifies an FTS5 virtual table (configured as a contentless table) during an INSERT ... ON CONFLICT DO UPDATE
operation. The trigger attempts to delete a non-existent row from the FTS5 table, leading to a malformed database disk image with error code SQLITE_CORRUPT_VTAB
. This occurs because the FTS5 contentless table relies on external synchronization with its underlying base table. When the trigger logic assumes the existence of rows that were never indexed in the FTS5 table (due to deferred trigger creation or incomplete data synchronization), the virtual table’s internal state becomes inconsistent. The corruption manifests during transactions that modify both the base table and its associated FTS5 table without proper atomicity guarantees.
The error is exacerbated by two design choices:
- Contentless FTS5 Configuration: The FTS5 table is declared with
content=''
, meaning it does not store a copy of the indexed data but instead relies on external triggers to maintain its index. This requires absolute synchronization between base table operations and FTS5 updates. - Deferred Trigger Initialization: Triggers responsible for updating the FTS5 table are created after the initial bulk load of data into the base table. This creates a window where updates to the base table (via
ON CONFLICT DO UPDATE
) fire triggers that manipulate an FTS5 table that lacks corresponding entries for existing rows.
The ON CONFLICT DO UPDATE
clause complicates trigger behavior. When resolving a conflict, it executes an update operation, which invokes the AFTER UPDATE
trigger. If the FTS5 table was not populated during the initial insert phase (due to missing AFTER INSERT
triggers at that time), the trigger’s DELETE
operation targets a row that does not exist in the FTS5 index. This violates the integrity of the contentless FTS5 table, as its internal rowid mappings become desynchronized from the base table.
Possible Causes: Desynchronization Between Base Tables and FTS5 Indexes
- Premature FTS5 Index Modifications: The
AFTER UPDATE
trigger modifies the FTS5 table before the initial data population is complete. If the base table already contains rows that were never indexed in the FTS5 table (due to triggers being added post-load), subsequent updates will attempt to delete non-existent FTS5 entries. - Transaction Rollback Failures: When a transaction that modifies both the base table and the FTS5 table is rolled back, SQLite’s atomicity guarantees may not fully apply to virtual tables. If the FTS5 module cannot roll back its changes (due to internal state inconsistencies), the table becomes corrupted.
- Contentless Table Limitations: Contentless FTS5 tables lack built-in mechanisms to validate rowid consistency with the base table. Trigger logic must enforce this manually, and any oversight (e.g., deleting a non-existent rowid) directly corrupts the index.
- Concurrent Write Operations: Although not explicitly mentioned in the discussion, concurrent writes to the base table and FTS5 index (e.g., in multi-threaded environments) could exacerbate desynchronization if transactions are not properly isolated.
Troubleshooting Steps, Solutions & Fixes
Step 1: Recover the Corrupted FTS5 Table
1.1 Export Base Table Data
Since the FTS5 table is corrupted, begin by exporting the data from the base table (figureCitations
) to a temporary storage (e.g., a CSV file or an in-memory database):
-- Connect to a new database for recovery
ATTACH DATABASE ':memory:' AS recovery_db;
-- Recreate the base table structure in the recovery database
CREATE TABLE recovery_db.figureCitations AS SELECT * FROM main.figureCitations;
1.2 Rebuild the FTS5 Table
Contentless FTS5 tables cannot be rebuilt using the rebuild
command. Instead, drop and recreate the FTS5 table and its triggers:
-- In the main database
DROP TRIGGER IF EXISTS fc_afterUpdate;
DROP TABLE IF EXISTS figureCitationsFts;
-- Recreate the FTS5 table
CREATE VIRTUAL TABLE figureCitationsFts USING fts5(
captionText,
content='figureCitations',
content_rowid='id'
);
-- Recreate triggers with synchronization checks
CREATE TRIGGER fc_afterUpdate AFTER UPDATE ON figureCitations
BEGIN
-- Only delete if the old row exists in the FTS table
DELETE FROM figureCitationsFts WHERE rowid = old.id;
INSERT INTO figureCitationsFts(rowid, captionText) VALUES (new.id, new.captionText);
END;
1.3 Reinitialize the FTS5 Index
Repopulate the FTS5 table by forcing a reindexing of all base table rows:
INSERT INTO figureCitationsFts(figureCitationsFts, rowid, captionText)
SELECT 'delete', id, captionText FROM figureCitations;
INSERT INTO figureCitationsFts(rowid, captionText)
SELECT id, captionText FROM figureCitations;
Step 2: Prevent Future Corruption
2.1 Atomic Trigger Initialization
Ensure triggers are created before any data is inserted into the base table. If bulk loading is necessary, disable triggers during the initial load and enable them afterward:
-- Disable triggers during bulk load
PRAGMA defer_foreign_keys = 1;
BEGIN TRANSACTION;
-- Bulk insert operations here
COMMIT;
-- Create triggers after initial load
CREATE TRIGGER fc_afterInsert AFTER INSERT ON figureCitations
BEGIN
INSERT INTO figureCitationsFts(rowid, captionText) VALUES (new.id, new.captionText);
END;
CREATE TRIGGER fc_afterUpdate AFTER UPDATE ON figureCitations ...; -- As before
2.2 Add Existence Checks in Triggers
Modify triggers to avoid modifying non-existent FTS5 rows. Use a subquery to verify the existence of the old rowid before deleting:
CREATE TRIGGER fc_afterUpdate AFTER UPDATE ON figureCitations
BEGIN
-- Check if the old rowid exists in the FTS table
DELETE FROM figureCitationsFts
WHERE rowid = old.id AND EXISTS (
SELECT 1 FROM figureCitationsFts WHERE rowid = old.id
);
INSERT INTO figureCitationsFts(rowid, captionText)
VALUES (new.id, new.captionText);
END;
2.3 Use Content-Enabled FTS5 Tables If Possible
If the contentless configuration is not strictly required, use a standard FTS5 table with content='figureCitations'
. This allows SQLite to manage the index automatically, reducing the risk of desynchronization:
CREATE VIRTUAL TABLE figureCitationsFts USING fts5(
captionText,
content='figureCitations',
content_rowid='id'
);
Step 3: Transaction and Error Handling Improvements
3.1 Enforce Transaction Boundaries
Wrap all operations affecting both the base table and FTS5 index in explicit transactions. Use savepoints to allow partial rollbacks:
// In Node.js with Better-SQLite3
const db = require('better-sqlite3')('database.db');
const insertStatement = db.prepare(`
INSERT INTO figureCitations (...) VALUES (...) ON CONFLICT ...;
`);
db.transaction(() => {
try {
insertStatement.run(...);
// Additional operations
} catch (e) {
// Rollback will be automatic
throw e;
}
})();
3.2 Validate FTS5 Integrity Programmatically
Periodically check the integrity of the FTS5 table using PRAGMA integrity_check
or custom queries:
-- Check for orphaned FTS5 rowids
SELECT fts.rowid
FROM figureCitationsFts fts
LEFT JOIN figureCitations fc ON fts.rowid = fc.id
WHERE fc.id IS NULL;
3.3 Implement Retry Logic for Transactions
In the application layer, add retry logic for transactions that fail due to transient errors like SQLITE_CORRUPT_VTAB
:
function runWithRetry(transactionFunc, maxRetries = 3) {
let attempts = 0;
while (attempts < maxRetries) {
try {
return db.transaction(transactionFunc)();
} catch (e) {
if (e.code === 'SQLITE_CORRUPT_VTAB' && attempts < maxRetries - 1) {
recoverFtsTable(); // Implement recovery steps
attempts++;
} else {
throw e;
}
}
}
}
Final Notes
The SQLITE_CORRUPT_VTAB
error in FTS5 contentless tables is a direct consequence of incomplete or incorrect synchronization between base tables and their associated virtual indexes. By enforcing atomic trigger initialization, adding existence checks in trigger logic, and adopting content-enabled FTS5 tables where feasible, the risk of corruption is minimized. Recovery requires rebuilding the FTS5 index from scratch while ensuring all future operations adhere to strict transactional boundaries.