ATTACHed Database cache_size Pragma Not Respected in SQLite: Causes & Fixes
Cross-Database Query Performance Mismatch Due to cache_size Configuration
When working with multiple SQLite databases in a single connection, users often leverage the ATTACH DATABASE command to execute cross-database queries. A critical performance optimization in such scenarios involves configuring the cache_size pragma to allocate sufficient memory for caching database pages. However, a common pitfall arises when the cache_size setting is applied only to the main database (the database opened at connection time) while neglecting to explicitly configure the cache for ATTACHed databases.
The core issue manifests as follows:
- A smaller "control" database is opened as the main database.
- A larger database (e.g., 70GB) is attached to this connection.
- Despite setting PRAGMA cache_size=<value>, queries against the larger database exhibit suboptimal performance due to inadequate page caching.
- The inverse scenario (attaching the smaller database to the larger main database) yields the expected cache benefits.
This discrepancy occurs because SQLite’s cache_size pragma is database-specific. When unqualified (e.g., PRAGMA cache_size=10000), it applies only to the main database. ATTACHed databases retain their own independent cache configurations, which default to -2000 in standard builds since SQLite 3.12: about 2,000 KiB (roughly 2MB), or around 500 pages at the default 4096-byte page size. For large databases, this default is insufficient to keep frequently accessed pages in memory, leading to excessive disk I/O and degraded query performance.
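The scoping behavior described above can be observed directly. The following is a minimal sketch using Python's standard sqlite3 module; the in-memory databases and the schema name large_db are stand-ins chosen for illustration, and the default value reported for the attached schema depends on how the SQLite library was built.

```python
import sqlite3

# In-memory databases stand in for the small "control" database (main)
# and the large attached database.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS large_db")

# Unqualified pragma: only the main schema is affected.
conn.execute("PRAGMA cache_size = 10000")

main_cache = conn.execute("PRAGMA main.cache_size").fetchone()[0]
attached_cache = conn.execute("PRAGMA large_db.cache_size").fetchone()[0]

print("main:", main_cache)
print("large_db:", attached_cache)  # still the build default, commonly -2000
```

The attached schema keeps its default even though the pragma ran after the ATTACH, which matches the symptom described in this section.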
Database-Specific Pragmas and Schema Context Misconfiguration
SQLite’s architecture treats each database in a multi-database connection as a separate schema with isolated settings. The cache_size pragma operates within the context of a specific schema, and its scope is determined by either:
- The default schema (usually main), which corresponds to the database opened at connection time.
- An explicitly named schema (e.g., attached_db) for ATTACHed databases.
Root Causes of cache_size Mismanagement
- Implicit Scope Assignment: Executing PRAGMA cache_size=<value> without a schema qualifier is equivalent to PRAGMA main.cache_size=<value> and applies the setting exclusively to the main database. ATTACHed databases keep their own default cache_size unless explicitly overridden.
- Default Cache Allocation: SQLite initializes each database’s page cache to about 2,000 KiB (2MB; the default cache_size is -2000) unless configured otherwise. For a 70GB database, this default is trivial, often resulting in <0.01% of the database being cached.
- Schema-Aware Pragma Execution Failure: Users unfamiliar with SQLite’s pragma scoping may assume PRAGMA cache_size applies globally to all databases in the connection. This misconception leads to misconfigured ATTACHed databases.
- Post-ATTACH Configuration Oversight: The cache_size of an ATTACHed database must be set after the ATTACH command. Before the ATTACH, the schema name does not exist on the connection, so a schema-qualified pragma cannot target it, and an unqualified pragma changes only main.
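The ordering requirement in the last item can be checked with a short experiment. This is a sketch under the assumption that the schema is named large_db and that an in-memory database is an acceptable stand-in; it records whatever error the qualified pragma raises before the ATTACH rather than relying on a specific message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Before ATTACH: the schema name large_db is unknown to this connection,
# so the qualified pragma is expected to be rejected.
try:
    conn.execute("PRAGMA large_db.cache_size = -10000")
    early_error = None
except sqlite3.OperationalError as exc:
    early_error = str(exc)

# After ATTACH: the same statement is accepted and the value reads back.
conn.execute("ATTACH DATABASE ':memory:' AS large_db")
conn.execute("PRAGMA large_db.cache_size = -10000")
after_attach = conn.execute("PRAGMA large_db.cache_size").fetchone()[0]

print("before ATTACH:", early_error)
print("after ATTACH:", after_attach)
```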
Impact of Improper Cache Sizing
- Page Cache Thrashing: Insufficient cache forces SQLite to repeatedly load and evict pages from memory, increasing I/O latency.
- Wasted Memory Resources: The main database’s cache may be over-provisioned while the ATTACHed database’s cache remains undersized.
- Unpredictable Query Performance: Queries against the ATTACHed database experience variable execution times due to inconsistent page availability.
Schema-Qualified cache_size Configuration and Validation
Resolving this issue requires explicit configuration of the cache_size pragma for each ATTACHed database, using schema-qualified pragmas. Below is a comprehensive guide to diagnosing and rectifying the problem.
Step 1: Validate Current Cache Settings
Before adjusting configurations, confirm the current cache_size for all databases in the connection:
PRAGMA main.cache_size; -- Returns cache_size for the main database
PRAGMA attached_db.cache_size; -- Returns cache_size for the ATTACHed database
Replace attached_db with the schema name used in the ATTACH command. If the ATTACHed database’s cache_size is still -2000 (the default in standard builds; older builds report 2000), it has not been configured and must be set explicitly.
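The check above can be automated for every schema on a connection. The helper below is an illustrative sketch, not part of SQLite: cache_sizes is a hypothetical name, and it relies on PRAGMA database_list, which returns one row per schema as (seq, name, file).

```python
import sqlite3


def cache_sizes(conn: sqlite3.Connection) -> dict:
    """Return {schema_name: cache_size} for every schema on the connection."""
    sizes = {}
    # PRAGMA database_list yields rows of (seq, name, file).
    for _seq, name, _file in conn.execute("PRAGMA database_list").fetchall():
        # Quote the schema name so unusual names remain valid identifiers.
        quoted = '"' + name.replace('"', '""') + '"'
        sizes[name] = conn.execute(f"PRAGMA {quoted}.cache_size").fetchone()[0]
    return sizes


conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS attached_db")
conn.execute("PRAGMA attached_db.cache_size = -10000")

report = cache_sizes(conn)
for schema, size in report.items():
    print(schema, size)
```

Any schema that still reports the build default in this listing is a candidate for explicit configuration.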
Step 2: Set Schema-Specific cache_size
After attaching the database, execute a schema-qualified cache_size pragma:
ATTACH 'path/to/large_db.sqlite' AS large_db;
PRAGMA large_db.cache_size = -10000; -- Allocate about 10,000 KiB (≈10MB); negative values are KiB, not pages
Key Notes:
- Negative values (e.g., -10000) set the cache size in kibibytes (KiB); SQLite converts this to a page count based on the page size.
- Positive values (e.g., 10000) set the size directly in pages. Use PRAGMA large_db.page_size; to determine the page size (default: 4096 bytes).
- The value is a suggested maximum rather than a reservation, so the practical ceiling is the memory available to the process rather than a fixed page count.
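Because the sign of the value changes its unit, it can help to convert a setting to bytes before comparing alternatives. The function below is a small arithmetic sketch; cache_bytes is a hypothetical helper name, and the 4096-byte page size is an assumed default that should be replaced by the result of PRAGMA page_size for a real database.

```python
def cache_bytes(cache_size: int, page_size: int = 4096) -> int:
    """Approximate memory budget implied by a cache_size value."""
    if cache_size < 0:
        # Negative: the magnitude is a size in KiB, independent of page size.
        return -cache_size * 1024
    # Positive: the value is a page count.
    return cache_size * page_size


kib_form = cache_bytes(-10000)         # 10,000 KiB
page_form = cache_bytes(10000, 4096)   # 10,000 pages of 4096 bytes

print(kib_form, page_form)
```

With 4096-byte pages the two forms differ by a factor of four, which is why -10000 and 10000 should not be treated as interchangeable.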
Step 3: Persist Cache Configuration
The cache_size setting belongs to the connection and is not stored in the database file, so it reverts to the default when the connection is closed. To keep cache_size consistent across sessions:
- Reconfigure on Connection Open: Execute the schema-qualified PRAGMA after attaching the database in every connection.
- Use Connection Pooling: If using a connection pool, ensure each new connection reconfigures the cache_size.
- Centralize Configuration: For applications with stable workloads, set cache_size in the routine that opens the connection and attaches the databases, so every code path receives the same setting.
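One way to apply these points is a single connection factory that attaches and sizes the caches together. The sketch below is illustrative only: open_with_attached, the file names, and the temporary directory are assumptions made for the example, and the schema name check is a simple guard because schema names cannot be bound as SQL parameters.

```python
import os
import sqlite3
import tempfile


def open_with_attached(main_path, attached, cache_kib):
    """Open main_path, attach each {schema: path}, and size each attached cache."""
    conn = sqlite3.connect(main_path)
    for schema, path in attached.items():
        if not schema.isidentifier():
            raise ValueError(f"unsafe schema name: {schema!r}")
        conn.execute(f"ATTACH DATABASE ? AS {schema}", (path,))
        # Negative value: cache size expressed in KiB.
        conn.execute(f"PRAGMA {schema}.cache_size = {-int(cache_kib)}")
    return conn


workdir = tempfile.mkdtemp()
control = os.path.join(workdir, "control.sqlite")
large = os.path.join(workdir, "large.sqlite")

first = open_with_attached(control, {"large_db": large}, cache_kib=10000)
first_value = first.execute("PRAGMA large_db.cache_size").fetchone()[0]
first.close()

# A plain reconnect does not inherit the earlier setting.
plain = sqlite3.connect(control)
plain.execute("ATTACH DATABASE ? AS large_db", (large,))
plain_value = plain.execute("PRAGMA large_db.cache_size").fetchone()[0]
plain.close()

print(first_value, plain_value)
```

Routing every connection, including pooled ones, through such a factory avoids relying on each caller to remember the pragma.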
Step 4: Monitor Cache Utilization
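Use SQLite’s sqlite3_db_status() C API, or the .stats command of the sqlite3 command-line shell, to assess cache effectiveness. The SQLITE_DBSTATUS_CACHE_HIT and SQLITE_DBSTATUS_CACHE_MISS counters report page cache hits and misses for the connection as a whole, covering all attached databases. (PRAGMA stats is an internal testing pragma that reports table and index information; it does not expose cache statistics.) In the sqlite3 shell:
.stats on
A miss count that stays high relative to hits across repeated queries indicates excessive disk I/O due to cache misses.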
Step 5: Optimize Cache Size Dynamically
For workloads with varying demands, adjust cache_size dynamically based on operational phases:
-- During bulk data ingestion to large_db
PRAGMA large_db.cache_size = -50000; -- 50MB
-- During query-heavy phases
PRAGMA large_db.cache_size = -200000; -- 200MB
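Phase-based adjustment is easier to manage when the previous value is restored automatically. The context manager below is a sketch of that idea; temporary_cache_size is a hypothetical helper, and the in-memory database and the sizes used are placeholders taken from the example above.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def temporary_cache_size(conn, schema, cache_size):
    """Set schema.cache_size for the duration of a block, then restore it."""
    if not schema.isidentifier():
        raise ValueError(f"unsafe schema name: {schema!r}")
    previous = conn.execute(f"PRAGMA {schema}.cache_size").fetchone()[0]
    conn.execute(f"PRAGMA {schema}.cache_size = {int(cache_size)}")
    try:
        yield
    finally:
        # Restore the earlier value even if the block raised.
        conn.execute(f"PRAGMA {schema}.cache_size = {int(previous)}")


conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS large_db")
conn.execute("PRAGMA large_db.cache_size = -50000")    # ingestion setting

with temporary_cache_size(conn, "large_db", -200000):  # query-heavy phase
    during = conn.execute("PRAGMA large_db.cache_size").fetchone()[0]

after = conn.execute("PRAGMA large_db.cache_size").fetchone()[0]
print(during, after)
```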
Advanced: Shared Cache Mode Considerations
In shared cache mode (deprecated), multiple connections share a single cache. This mode complicates cache_size management, as the pragma applies to the entire shared cache. Avoid shared cache mode for multi-database configurations requiring independent cache tuning.
Summary of Fixes and Best Practices
- Always Qualify Pragmas for ATTACHed Databases: Use <schema>.cache_size instead of the unqualified form.
- Set cache_size Post-ATTACH: Configuration must occur after attaching the database.
- Align Cache Size to Database Workload: Larger databases benefit from larger caches. A 70GB database with a 4KB page size has ≈18 million pages; a 100,000-page cache (≈400MB) caches 0.55% of the database.
- Validate Settings Across Connections: Ensure configurations are reapplied in pooled or reused connections.
- Benchmark and Iterate: Adjust cache_size based on performance metrics and workload patterns.
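The sizing arithmetic in the list above can be reproduced for other database sizes. This is a worked sketch that assumes "70GB" means 70 GiB and a 4096-byte page size; cached_fraction is a hypothetical helper name.

```python
GIB = 1024 ** 3


def cached_fraction(db_bytes: int, page_size: int, cache_pages: int) -> float:
    """Fraction of the database's pages that fit in a cache of cache_pages."""
    total_pages = db_bytes // page_size
    return min(1.0, cache_pages / total_pages)


total_pages = (70 * GIB) // 4096   # 18,350,080 pages
fraction = cached_fraction(70 * GIB, 4096, 100_000)

print(f"{total_pages:,} pages; {fraction:.3%} of them fit in a 100,000-page cache")
```

The result is a little over half a percent, in line with the figure quoted in the list, so a cache of this size helps mainly when the working set is a small, frequently revisited portion of the file.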
By adhering to these guidelines, users can eliminate the performance disparity between main and ATTACHed databases, ensuring optimal utilization of SQLite’s page cache across multi-database environments.