ATTACHed Database cache_size Pragma Not Respected in SQLite: Causes & Fixes


Cross-Database Query Performance Mismatch Due to cache_size Configuration

When working with multiple SQLite databases in a single connection, users often leverage the ATTACH DATABASE command to execute cross-database queries. A critical performance optimization in such scenarios involves configuring the cache_size pragma to allocate sufficient memory for caching database pages. However, a common pitfall arises when the cache_size setting is applied only to the main database (the database opened at connection time) while neglecting to explicitly configure the cache for ATTACHed databases.

The core issue manifests as follows:

  • A smaller "control" database is opened as the main database.
  • A larger database (e.g., 70GB) is attached to this connection.
  • Despite setting PRAGMA cache_size=<value>, queries against the larger database exhibit suboptimal performance due to inadequate page caching.
  • The inverse scenario—attaching the smaller database to the larger main database—yields expected cache benefits.

This discrepancy occurs because SQLite’s cache_size pragma is scoped to a single database. When unqualified (e.g., PRAGMA cache_size=10000), it applies only to the main database. ATTACHed databases keep their own independent cache configurations, which default to -2000 in standard builds, meaning roughly 2,000 KiB (about 500 pages at the default 4,096-byte page size). For large databases, this default is insufficient to keep frequently accessed pages in memory, leading to excessive disk I/O and degraded query performance.
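The scoping behaviour can be checked directly. The following is a minimal sketch using Python's standard sqlite3 module; the file names control.sqlite and large.sqlite and the schema name large_db are placeholders chosen for illustration, and the value reported for the attached schema depends on the defaults of the SQLite build in use.

```python
import os
import sqlite3
import tempfile

def cache_sizes_after_unqualified_pragma(pages=10000):
    """Return (main, attached) cache_size after an unqualified PRAGMA."""
    with tempfile.TemporaryDirectory() as tmp:
        con = sqlite3.connect(os.path.join(tmp, "control.sqlite"))
        try:
            con.execute("ATTACH DATABASE ? AS large_db",
                        (os.path.join(tmp, "large.sqlite"),))
            # Unqualified form: only the main schema is affected.
            con.execute(f"PRAGMA cache_size = {int(pages)}")
            main_size = con.execute("PRAGMA main.cache_size").fetchone()[0]
            attached_size = con.execute("PRAGMA large_db.cache_size").fetchone()[0]
        finally:
            con.close()
    return main_size, attached_size

main_size, attached_size = cache_sizes_after_unqualified_pragma()
print("main:", main_size, "large_db:", attached_size)
```

The main schema reports the new value, while large_db still reports the build default.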


Database-Specific Pragmas and Schema Context Misconfiguration

SQLite’s architecture treats each database in a multi-database connection as a separate schema with isolated settings. The cache_size pragma operates within the context of a specific schema, and its scope is determined by either:

  1. The default schema (usually main), which corresponds to the database opened at connection time.
  2. An explicitly named schema (e.g., attached_db) for ATTACHed databases.

Root Causes of cache_size Mismanagement

  1. Implicit Scope Assignment: Executing PRAGMA cache_size=<value> without a schema qualifier is equivalent to PRAGMA main.cache_size=<value> and applies the setting exclusively to the main database. ATTACHed databases keep their own default cache_size unless explicitly overridden.
  2. Default Cache Allocation: SQLite initializes each database’s cache_size to -2000 (about 2,000 KiB, or roughly 2MB) unless configured otherwise. For a 70GB database, this default is trivial, leaving less than 0.01% of the database cached.
  3. Schema-Aware Pragma Execution Failure: Users unfamiliar with SQLite’s pragma scoping may assume PRAGMA cache_size applies globally to all databases in the connection. This misconception leads to misconfigured ATTACHed databases.
  4. Post-ATTACH Configuration Oversight: The cache_size of an ATTACHed database must be set after the ATTACH command. A schema-qualified pragma issued before the ATTACH fails because the schema name does not exist yet, and an unqualified pragma issued earlier affects only main.
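The ordering requirement in the last item can be illustrated with a short sketch. It uses in-memory databases and a hypothetical schema name large_db; the exact error text is whatever the SQLite library reports, so it is captured rather than assumed.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Before ATTACH the schema name does not exist, so the pragma is rejected.
try:
    con.execute("PRAGMA large_db.cache_size = -100000")
    error_before_attach = None
except sqlite3.OperationalError as exc:
    error_before_attach = str(exc)

# After ATTACH the same statement succeeds and can be read back.
con.execute("ATTACH DATABASE ':memory:' AS large_db")
con.execute("PRAGMA large_db.cache_size = -100000")
size_after_attach = con.execute("PRAGMA large_db.cache_size").fetchone()[0]
con.close()

print("before ATTACH:", error_before_attach)
print("after ATTACH:", size_after_attach)
```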

Impact of Improper Cache Sizing

  • Page Cache Thrashing: Insufficient cache forces SQLite to repeatedly load and evict pages from memory, increasing I/O latency.
  • Wasted Memory Resources: The main database’s cache may be over-provisioned while the ATTACHed database’s cache remains undersized.
  • Unpredictable Query Performance: Queries against the ATTACHed database experience variable execution times due to inconsistent page availability.

Schema-Qualified cache_size Configuration and Validation

Resolving this issue requires explicit configuration of the cache_size pragma for each ATTACHed database, using schema-qualified pragmas. Below is a comprehensive guide to diagnosing and rectifying the problem.

Step 1: Validate Current Cache Settings

Before adjusting configurations, confirm the current cache_size for all databases in the connection:

PRAGMA main.cache_size;     -- Returns cache_size for the main database  
PRAGMA attached_db.cache_size; -- Returns cache_size for the ATTACHed database  

Replace attached_db with the schema name used in the ATTACH command. If the ATTACHed database’s cache_size is still -2000 (the default in standard builds), it needs to be raised for a large database.
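When several databases are attached, checking them one by one is tedious. A possible sketch, assuming Python's sqlite3 module and a hypothetical helper named cache_size_report, walks PRAGMA database_list and reads the setting for every schema in the connection:

```python
import sqlite3

def cache_size_report(con):
    """Map each schema name in the connection to its current cache_size."""
    report = {}
    # PRAGMA database_list yields (seq, name, file) for every schema.
    for _seq, name, _file in con.execute("PRAGMA database_list").fetchall():
        if not name.isidentifier():
            continue  # this sketch skips names that would need quoting
        report[name] = con.execute(f"PRAGMA {name}.cache_size").fetchone()[0]
    return report

con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS attached_db")
con.execute("PRAGMA main.cache_size = 10000")
report = cache_size_report(con)
con.close()
print(report)
```

Any schema that still shows the build default after the main database was tuned is a candidate for explicit configuration.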

Step 2: Set Schema-Specific cache_size

After attaching the database, execute a schema-qualified cache_size pragma:

ATTACH 'path/to/large_db.sqlite' AS large_db;  
PRAGMA large_db.cache_size = -10000;  -- Allocate about 10,000 KiB (≈10MB), not 10,000 pages

Key Notes:

  • Negative values (e.g., -10000) set the cache size in kibibytes (KiB), with SQLite automatically converting this to pages.
  • Positive values (e.g., 10000) set the size directly in pages. Use PRAGMA large_db.page_size; to determine the page size (default: 4096 bytes).
  • The value is an upper limit rather than a reservation: cache memory is normally allocated as pages are read, so the practical ceiling is the memory available to the process.
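The relationship between the two units can be made concrete. The sketch below assumes a hypothetical helper approx_cache_pages; it ignores SQLite's small per-page bookkeeping overhead, so the real page count for a negative setting is slightly lower than the figure it returns.

```python
import sqlite3

def approx_cache_pages(cache_size, page_size):
    """Approximate page capacity implied by a cache_size value.

    Positive values are already a page count; negative values are
    abs(value) KiB, converted here to pages of page_size bytes.
    """
    if cache_size >= 0:
        return cache_size
    return (abs(cache_size) * 1024) // page_size

con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS large_db")
con.execute("PRAGMA large_db.cache_size = -10000")
page_size = con.execute("PRAGMA large_db.page_size").fetchone()[0]
setting = con.execute("PRAGMA large_db.cache_size").fetchone()[0]
pages = approx_cache_pages(setting, page_size)
con.close()

print(f"cache_size={setting}, page_size={page_size}, about {pages} pages")
```

With 4,096-byte pages, a setting of -10000 works out to roughly 2,500 pages, a quarter of what a positive 10000 would allow.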

Step 3: Persist Cache Configuration

The cache_size setting is not stored in the database file; it reverts to the default when the connection is closed. To apply it consistently across sessions:

  1. Reconfigure on Connection Open: Execute the schema-qualified PRAGMA after attaching the database in every connection.
  2. Use Connection Pooling: If using a connection pool, ensure each new connection reconfigures the cache_size.
  3. Change the Build Default: Applications that ship their own SQLite build can change the default for every database, including ATTACHed ones, with the SQLITE_DEFAULT_CACHE_SIZE compile-time option.
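One way to guarantee reconfiguration on every open is to route all connections through a single factory. The sketch below is an assumption about how an application might structure this, not a prescribed pattern; open_with_attachments, the file names, and the 200,000 KiB budget are illustrative.

```python
import os
import sqlite3
import tempfile

def open_with_attachments(main_path, attachments):
    """Open main_path, attach each database, and apply its cache size.

    attachments maps schema name -> (file path, cache size in KiB).
    """
    con = sqlite3.connect(main_path)
    for schema, (path, cache_kib) in attachments.items():
        if not schema.isidentifier():
            con.close()
            raise ValueError(f"unsupported schema name: {schema!r}")
        con.execute(f"ATTACH DATABASE ? AS {schema}", (path,))
        # Negative values are interpreted by SQLite as KiB.
        con.execute(f"PRAGMA {schema}.cache_size = {-abs(int(cache_kib))}")
    return con

with tempfile.TemporaryDirectory() as tmp:
    main_path = os.path.join(tmp, "control.sqlite")
    large_path = os.path.join(tmp, "large.sqlite")
    sizes = []
    for _ in range(2):  # two separate sessions against the same files
        con = open_with_attachments(main_path, {"large_db": (large_path, 200000)})
        sizes.append(con.execute("PRAGMA large_db.cache_size").fetchone()[0])
        con.close()

print(sizes)
```

A connection pool can use the same function as its connection constructor so that pooled connections never start with the default cache.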

Step 4: Monitor Cache Utilization

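Use the sqlite3_db_status() C API (with SQLITE_DBSTATUS_CACHE_HIT, SQLITE_DBSTATUS_CACHE_MISS, and SQLITE_DBSTATUS_CACHE_USED) or the sqlite3 shell’s .stats command to assess cache effectiveness. No pragma reports cache hits and misses; PRAGMA stats is an internal testing aid that describes tables and indexes.

-- In the sqlite3 shell: print statistics after each statement
.stats on
SELECT count(*) FROM large_db.some_table;  -- some_table is a placeholder

A miss count that keeps growing relative to hits while the same query is repeated indicates excessive disk I/O. These counters cover the whole connection rather than a single schema.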

Step 5: Optimize Cache Size Dynamically

For workloads with varying demands, adjust cache_size dynamically based on operational phases:

-- During bulk data ingestion to large_db  
PRAGMA large_db.cache_size = -50000;  -- 50MB  

-- During query-heavy phases  
PRAGMA large_db.cache_size = -200000; -- 200MB  
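If the phases are driven from application code, the switch can be scoped so the previous setting is always restored. A minimal sketch, assuming Python's sqlite3 module and a hypothetical context manager named temporary_cache_size:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def temporary_cache_size(con, schema, cache_size):
    """Apply cache_size to one schema for the duration of a with-block."""
    if not schema.isidentifier():
        raise ValueError(f"unsupported schema name: {schema!r}")
    previous = con.execute(f"PRAGMA {schema}.cache_size").fetchone()[0]
    con.execute(f"PRAGMA {schema}.cache_size = {int(cache_size)}")
    try:
        yield
    finally:
        # Restore whatever was configured before the block.
        con.execute(f"PRAGMA {schema}.cache_size = {int(previous)}")

con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS large_db")
before = con.execute("PRAGMA large_db.cache_size").fetchone()[0]
with temporary_cache_size(con, "large_db", -200000):
    during = con.execute("PRAGMA large_db.cache_size").fetchone()[0]
after = con.execute("PRAGMA large_db.cache_size").fetchone()[0]
con.close()

print(before, during, after)
```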

Advanced: Shared Cache Mode Considerations

In shared cache mode (deprecated), multiple connections share a single cache. This mode complicates cache_size management, as the pragma applies to the entire shared cache. Avoid shared cache mode for multi-database configurations requiring independent cache tuning.


Summary of Fixes and Best Practices

  1. Always Qualify Pragmas for ATTACHed Databases: Use <schema>.cache_size instead of the unqualified form.
  2. Set cache_size Post-ATTACH: Configuration must occur after attaching the database.
  3. Align Cache Size to Database Workload: Larger databases benefit from larger caches. A 70GB database with a 4KB page size has ≈18 million pages; a 100,000-page cache (≈400MB) caches 0.55% of the database.
  4. Validate Settings Across Connections: Ensure configurations are reapplied in pooled or reused connections.
  5. Benchmark and Iterate: Adjust cache_size based on performance metrics and workload patterns.
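The sizing arithmetic in item 3 is easy to repeat for other databases. The sketch below assumes the 70GB figure means 70 GiB and uses a hypothetical helper cached_fraction; with decimal gigabytes the page count comes out somewhat lower.

```python
def cached_fraction(db_bytes, page_size, cache_pages):
    """Return (total pages, fraction of the database the cache can hold)."""
    total_pages = db_bytes // page_size
    return total_pages, min(1.0, cache_pages / total_pages)

# Assumption: 70 GiB database, 4,096-byte pages, 100,000-page cache.
total_pages, fraction = cached_fraction(70 * 1024**3, 4096, 100_000)
print(f"{total_pages:,} pages, {fraction:.2%} of the database cacheable")
```

The result is on the order of half a percent, which is why the working set, not the file size, should drive the final cache_size choice.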

By adhering to these guidelines, users can eliminate the performance disparity between main and ATTACHed databases, ensuring optimal utilization of SQLite’s page cache across multi-database environments.
