SQLite Lock-Byte Page Size and Database File Calculations

Lock-Byte Page Structure and Database Page Size Alignment

Issue Overview
The core issue revolves around the relationship between SQLite’s lock-byte page, database page size, and how these elements influence the total size of a database file. Specifically, the confusion arises from two interconnected questions:

  1. How does the lock-byte page behave when the database uses a page size larger than 512 bytes?
  2. How does the lock-byte page affect the calculation of the total database file size as reported in the database header?

The lock-byte page is a critical component of SQLite’s file-format design. It is defined as the single page containing bytes at offsets between 1,073,741,824 (1 GiB) and 1,073,742,335 (1 GiB + 512 bytes). This range spans 512 bytes. However, SQLite databases can be configured with page sizes ranging from 512 bytes to 65,536 bytes (always a power of two). When the page size exceeds 512 bytes, the lock-byte page occupies an entire page, but only the first 512 bytes of that page are used for locking purposes. The remaining bytes in the page are unused.

The second concern involves interpreting the "in-header" database size value, which specifies the total number of pages in the database file. The lock-byte page is included in this count, even though its size and usage differ from regular data pages. This leads to discrepancies when calculating the total file size, especially when the page size is larger than 512 bytes. For example, a database with 1,000,000 pages and a 4,096-byte page size would nominally be 4,096,000,000 bytes, but the presence of the lock-byte page introduces an adjustment.

Lock-Byte Page Implementation and Configuration Conflicts

Possible Causes

  1. Misinterpretation of Lock-Byte Page Definition: The documentation states that the lock-byte page "contains" the 512-byte range (1 GiB to 1 GiB + 512 bytes). This does not mean the page is limited to 512 bytes. Instead, the lock-byte page is a full database page (as per the configured page size) that overlaps with the 1 GiB offset. For example, a 4,096-byte page size means the lock-byte page spans offsets 1,073,741,824 to 1,073,745,919 (4,096 bytes), but only the first 512 bytes are used for locking.

  2. Page Size and Lock-Byte Page Alignment: SQLite enforces that the lock-byte page always starts at the 1 GiB boundary. Since page sizes are powers of two, the 1 GiB offset is guaranteed to align with the start of a page. For instance, with a 4,096-byte page size, the page starting at 1,073,741,824 is page number 262,144 (1 GiB / 4,096). This alignment ensures the lock-byte page is a single page, regardless of page size.

  3. Reserved Region Misconceptions: The documentation notes that the "reserved region" (space at the end of each page for extensions) does not apply to the lock-byte page. This does not imply that the lock-byte page is smaller than the configured page size. Instead, the entire lock-byte page is excluded from the reserved region mechanism. The unused portion of the lock-byte page (beyond the first 512 bytes) is neither part of the reserved region nor used for any other purpose.

  4. File Size Calculation Errors: The in-header database size (stored at offset 28 in the database header) counts all pages, including the lock-byte page. However, the lock-byte page may not exist in newly created databases until the file grows beyond 1 GiB. This creates confusion when translating the page count into total file size, especially when the lock-byte page is implicitly added.

Resolving Lock-Byte Page Size Conflicts and File Size Discrepancies

Troubleshooting Steps, Solutions & Fixes

Step 1: Clarify Lock-Byte Page Size and Composition
The lock-byte page is always one full page in size, as defined by the database’s page size (512, 1k, 2k, …, 64k). The 512-byte locking region is a subset of this page, starting at offset 0 within the page. For example:

  • Page size = 512 bytes: The entire page is the lock-byte region.
  • Page size = 4,096 bytes: The lock-byte region occupies the first 512 bytes; the remaining 3,584 bytes are unused.

This design ensures alignment with the 1 GiB boundary while accommodating varying page sizes.

Step 2: Validate Lock-Byte Page Creation and File Growth
The lock-byte page is created implicitly when the database file grows beyond 1 GiB. To test this:

  1. Create a database with PRAGMA page_size=4096.
  2. Insert data until the file exceeds 1 GiB.
  3. Use a hex editor to inspect offsets 1,073,741,824 to 1,073,745,919. The first 512 bytes will be the lock region; the rest will be zeroed.

This confirms that the lock-byte page exists as a full 4,096-byte page, with only the initial 512 bytes utilized.

Step 3: Correct File Size Calculations
The total file size is calculated as:

file_size = (page_count * page_size) + (lock_byte_page_exists ? page_size : 0)  

However, the in-header page count includes the lock-byte page if it exists. For example:

  • If the header reports 1,000,000 pages and page_size=4,096, the expected size is 1,000,000 * 4,096 = 4,096,000,000 bytes.
  • If the lock-byte page exists (i.e., file size >1 GiB), it is already counted in the page_count.

The confusion arises from the fact that the lock-byte page is counted as a regular page in the header but has a different internal structure.

Step 4: Address Reserved Region Exceptions
The reserved region (set via PRAGMA reserved_bytes) allocates space at the end of each page for extensions. This region does not apply to the lock-byte page. For example:

  • With PRAGMA reserved_bytes=64, a 4,096-byte page reserves the last 64 bytes.
  • The lock-byte page (4,096 bytes) does not reserve these 64 bytes; the entire page after the first 512 bytes remains unused.

This exception ensures the lock-byte page’s structure remains consistent across VFS implementations.

Step 5: Handling VFS-Specific Locking Mechanisms
While the lock-byte page exists in all database files larger than 1 GiB, its usage depends on the VFS. For example:

  • The default unix and win32 VFSes use byte-range locks on the lock-byte region.
  • In-memory or custom VFSes may ignore the lock-byte page entirely.

Regardless of VFS behavior, the lock-byte page is always allocated and counted in the total file size when required.

Step 6: Tools for Inspection and Verification
Use the sqlite3_analyzer tool or custom SQL queries to inspect page counts and verify lock-byte page inclusion:

PRAGMA page_size;  
PRAGMA page_count;  

Compare the theoretical file size (page_size * page_count) with the actual file size. Discrepancies may indicate unaccounted pages (e.g., write-ahead log or temporary journals), not lock-byte page errors.

Step 7: Avoiding Off-by-One Errors in File Size Scripts
When programmatically calculating database file sizes, always derive the lock-byte page’s existence from the file size itself:

page_size = db.header.page_size  
page_count = db.header.page_count  
expected_size = page_size * page_count  
if os.path.getsize(db.path) != expected_size:  
    # Check for lock-byte page or WAL files  

This avoids assumptions about whether the lock-byte page is included in the header’s page count.

Step 8: Addressing Edge Cases with Small Page Sizes
For databases with 512-byte pages, the lock-byte page is exactly the 512-byte region at 1 GiB. This means:

  • The lock-byte page is page number 2,097,152 (1 GiB / 512).
  • The entire page is used for locking, leaving no unused space.

This edge case reinforces that the lock-byte page’s structure adapts to the configured page size while maintaining the 512-byte locking region.

Final Recommendations

  1. Always use PRAGMA page_count multiplied by PRAGMA page_size for theoretical file size calculations, acknowledging that the lock-byte page is included.
  2. Treat the lock-byte page as a full page in size, with only the first 512 bytes having a defined purpose.
  3. Validate file size discrepancies using tools like sqlite3_analyzer to rule out WAL files or temporary artifacts.

By adhering to these principles, developers can avoid miscalculations and misinterpretations of SQLite’s file structure, particularly when dealing with large databases and varying page sizes.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *