SQLite 3.40 PRAGMA Integrity_Check Enhancements and VACUUM INTO Documentation Clarifications
Affinity Validation in Non-STRICT Tables: Understanding the PRAGMA Integrity_Check Enhancements
The release of SQLite 3.40 introduced critical enhancements to the PRAGMA integrity_check
command, specifically targeting validation rules for columns in non-STRICT tables. A common misinterpretation of these enhancements revolves around whether the checks apply to STRICT or non-STRICT tables. The documentation initially stated that non-STRICT tables with TEXT affinity should not contain numeric values and vice versa for NUMERIC affinity columns. This raised questions about whether the documentation had inadvertently swapped "STRICT" and "non-STRICT" labels, given that STRICT tables enforce strict type adherence.
To clarify: non-STRICT tables are the correct focus of these enhancements. The confusion stems from misunderstanding how SQLite handles type affinity and storage. In non-STRICT tables, columns have type affinity (TEXT, NUMERIC, INTEGER, REAL, BLOB), which guides how values are converted during insertion. For example, inserting a numeric value into a TEXT affinity column converts it to a string representation. However, if the schema is manually altered (e.g., via PRAGMA writable_schema
) without updating stored data, the database can enter a state where stored types no longer match the expected affinity. This creates inconsistencies between the index and data storage, leading to incorrect query results when the query planner relies on indexes that assume affinity adherence.
Consider a non-STRICT table t1(a NUMERIC)
with an index on a
. If the schema is altered to CREATE TABLE t1(a TEXT)
without updating stored data, some rows may retain numeric storage types. Queries using the index might fail to retrieve these rows when searching for numeric or text equivalents, while full-table scans (without the index) would find them. The PRAGMA integrity_check
enhancements in 3.40 detect such mismatches by ensuring that stored values adhere to the affinity rules implied by the original schema. STRICT tables are excluded from these checks because they inherently enforce type restrictions at insertion, leaving no room for post-hoc affinity mismatches.
The VACUUM INTO Typographical Error: Resolving the "They PRAGMA Synchronous" Documentation Issue
A secondary documentation error in the SQLite 3.40 release notes involved the VACUUM INTO
statement’s reference to honoring "they PRAGMA synchronous setting." This was a typographical error where "they" should read "the." While minor, this error could cause confusion about whether VACUUM INTO
respects the synchronous pragma, which governs how aggressively SQLite flushes data to disk to ensure durability after write operations.
The PRAGMA synchronous
setting has three modes: OFF
(no synchronization), NORMAL
(synchronize at critical moments), and FULL
(synchronize after all writes). When VACUUM INTO
creates a new database file, it must honor this setting to maintain the desired balance between performance and data integrity. For instance, using PRAGMA synchronous=FULL
ensures that the vacuumed database is fully flushed to disk before the operation completes, reducing the risk of data corruption in the event of a system crash. The corrected documentation confirms that VACUUM INTO
adheres to this pragma, aligning its behavior with user expectations.
STRICT vs. Non-STRICT Table Validation: Resolving Affinity Corruption Through Schema Enforcement
The distinction between STRICT and non-STRICT tables is pivotal in understanding why the PRAGMA integrity_check
enhancements target the latter. STRICT tables, introduced in SQLite 3.37.0, enforce rigid type checking at insertion time. If a value does not match the declared column type (e.g., inserting a string into an INTEGER column), the operation is aborted. This eliminates the possibility of type affinity mismatches, as all values are stored exactly as specified. Consequently, STRICT tables do not require post-hoc affinity validation via PRAGMA integrity_check
—their integrity is enforced at the point of data entry.
Non-STRICT tables, however, rely on type affinity conversions. A NUMERIC affinity column will attempt to convert inserted text values to integers or real numbers. If the conversion is possible, the value is stored as a numeric type; otherwise, it remains text. This flexibility allows for "soft" type enforcement but introduces the risk of schema-data mismatches if the schema is altered improperly. The 3.40 enhancements to PRAGMA integrity_check
address this by verifying that stored values align with the expected affinity conversions. For example, in a non-STRICT table with NUMERIC affinity, all text values must be non-convertible to numbers (since convertible text should have been stored as numeric types). Violations indicate corruption, often due to manual schema edits that bypass SQLite’s affinity conversion logic.
To resolve such issues, users must ensure that the schema declared in sqlite_schema
matches the actual data storage. This can involve recreating tables with the correct schema, migrating data to adhere to the new affinity rules, or avoiding manual schema modifications unless absolutely necessary. The enhancements underscore SQLite’s commitment to data integrity while preserving the flexibility of non-STRICT tables for use cases where dynamic typing is advantageous.