Segmentation Fault in SQLite 3.38.1 When Accessing Zero-Length ZIP Blob
Crash Triggered by ZIP File Functions with Zero-Length Input
Issue Overview
The core issue is a segmentation fault (SEGV) in SQLite version 3.38.1 when a zero-length BLOB is passed to the zipfile() table-valued function. The crash occurs while executing the query:
SELECT zipfile('test.zip'), mtime, data, method FROM zipfile(zeroblob('test.zip'));
This query passes a zero-length BLOB to the zipfile virtual table, which attempts to parse it as a ZIP archive. (zeroblob() expects an integer byte count; the text argument 'test.zip' converts to 0, so the result is an empty BLOB.) The AddressSanitizer (ASAN) report shows a fatal memory access violation in the fseek function, triggered by an attempt to read from the zero page (address 0x000000000000). The stack trace points to the zipfileReadEOCD function (in shell.c:7852), which is part of SQLite’s ZIP file handling logic.
The ZIP file format requires a valid "End of Central Directory" (EOCD) record to locate files within the archive. When the input is a zero-length BLOB, the code tries to read this record from data that does not exist. Because the input is a BLOB rather than a file on disk, there is no open file handle behind it; the fault at address zero is consistent with fseek being called on a null file pointer, which crashes the process.
The root problem is missing input validation for the BLOB passed to the zipfile virtual table. The code assumes that the input is a valid ZIP archive and, when given empty or malformed data, proceeds with operations that require a properly structured file. The result is undefined behavior.
Failure Modes in ZIP Archive Parsing Logic
Invalid ZIP File Structure
The zeroblob('test.zip') call generates a BLOB of zero bytes, which is not a valid ZIP archive. A ZIP file must contain at least an EOCD record (22 bytes minimum), including metadata such as the number of disks, the offset to the central directory, and a signature (0x06054b50). When the input is empty, the code in zipfileReadEOCD attempts to locate the EOCD by reading backward from the end of the file. Since there is no data, the calculated offset becomes a negative or invalid value, leading to an out-of-bounds memory access.
Missing Error Handling in fseek and File Operations
The zipfileReadEOCD function uses fseek to navigate the input file/BLOB. When the input is zero-length, fseek is called with an offset that exceeds the file’s bounds. The return value of fseek is not checked for errors, allowing execution to proceed even when the file position is invalid. Subsequent reads (via fread) then operate on an invalid file pointer, causing a segmentation fault.
Assumption of Non-Empty Input in Virtual Table Implementation
The zipfile virtual table, implemented in shell.c, assumes that its input is a valid ZIP archive. It does not handle edge cases such as empty inputs, partially written archives, or non-ZIP files. The zipfileLoadDirectory function (called by zipfileFilter) attempts to load the central directory entries without first verifying that the EOCD exists or that the file size is sufficient to contain ZIP structures.
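To make the backward search for the EOCD concrete, here is a minimal sketch in C of a bounds-checked scan over an in-memory buffer. It is not SQLite's code: the name zip_find_eocd and both constants are assumptions chosen for illustration. The point is that the empty-input case is rejected before any offset arithmetic takes place.

```c
#include <assert.h>
#include <stddef.h>

#define EOCD_MIN_SIZE    22     /* fixed-size part of the EOCD record */
#define EOCD_MAX_COMMENT 65535  /* the comment length is a 16-bit field */

/* Hypothetical helper: return the byte offset of the EOCD signature
** (50 4b 05 06) in buf, or -1 if the buffer is too small or holds no
** signature. The scan runs backward from the last possible position. */
static long zip_find_eocd(const unsigned char *buf, size_t len){
  size_t i, stop;
  assert( buf!=NULL || len==0 );
  if( buf==NULL || len<EOCD_MIN_SIZE ) return -1;  /* covers the empty BLOB */
  /* The record cannot start more than 22+65535 bytes before the end. */
  stop = (len > EOCD_MIN_SIZE + EOCD_MAX_COMMENT)
           ? len - (EOCD_MIN_SIZE + EOCD_MAX_COMMENT) : 0;
  for(i = len - EOCD_MIN_SIZE; ; i--){
    if( buf[i]==0x50 && buf[i+1]==0x4b && buf[i+2]==0x05 && buf[i+3]==0x06 ){
      return (long)i;
    }
    if( i==stop ) break;   /* stop before the unsigned index can wrap */
  }
  return -1;
}
```

Because the size test comes first, len - EOCD_MIN_SIZE can never underflow, which is the kind of negative or invalid offset described above.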
Resolving the Crash: Input Validation and Boundary Checks
Step 1: Apply the Official Patch
The SQLite development team addressed this issue in the commit referenced in the forum reply. The fix described here adds a check on the input size before attempting to read the EOCD: if the input is smaller than the minimum size of a valid ZIP archive (22 bytes), the function returns an error immediately instead of proceeding with invalid operations.
Code Change Example (an illustrative sketch, not the verbatim upstream diff; fileSize() is a placeholder for the actual size lookup):
// In zipfileReadEOCD (shell.c):
i64 nByte = fileSize(pFile);
if( nByte < 22 ){
  return SQLITE_ERROR; // Abort if file too small
}
This prevents fseek from being called with invalid offsets when the input is empty or too small.
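For file-backed input, the same idea can be sketched as a size lookup that checks every stdio return value. This is an assumption-laden illustration in plain C, not code from shell.c: zip_checked_size and ZIP_EOCD_MIN are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

#define ZIP_EOCD_MIN 22   /* smallest possible EOCD record */

/* Hypothetical helper: store the size of pFile in *pSize and leave the
** file positioned ZIP_EOCD_MIN bytes before its end. Returns 0 on
** success, or -1 if the handle is NULL, a seek/tell call fails, or the
** file is too small to hold an EOCD record. */
static int zip_checked_size(FILE *pFile, long *pSize){
  long n;
  assert( pSize!=NULL );
  if( pFile==NULL ) return -1;                     /* never seek on NULL */
  if( fseek(pFile, 0, SEEK_END)!=0 ) return -1;    /* check the seek */
  n = ftell(pFile);
  if( n<0 ) return -1;                             /* check the tell */
  if( n<ZIP_EOCD_MIN ) return -1;                  /* empty or truncated */
  if( fseek(pFile, n - ZIP_EOCD_MIN, SEEK_SET)!=0 ) return -1;
  *pSize = n;
  return 0;
}
```

A caller would treat -1 as "not a ZIP archive" and report an error instead of continuing to fread.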
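Step 2: Validate Inputs to zipfile Functions
When using the zipfile virtual table, ensure that inputs are valid ZIP archives. For programmatic use cases, add pre-checks:
-- Check whether the BLOB plausibly holds a ZIP archive before querying:
SELECT
  CASE
    WHEN length(zip_data) >= 22 AND substr(zip_data, -22, 4) = x'504b0506'
    THEN 1
    ELSE 0
  END AS is_valid_zip
FROM ...;
This verifies in SQL that the input has at least 22 bytes and that its final 22 bytes begin with the EOCD signature (the bytes 50 4b 05 06, that is, "PK" followed by 0x05 0x06). SQL string literals do not interpret \x escapes, so the signature is written as a BLOB literal. The check assumes the archive has no trailing comment; an archive with a comment has its EOCD earlier in the data and would be reported as invalid here.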
Step 3: Handle Empty Inputs Gracefully in Application Code
If your application allows users to supply arbitrary BLOBs to zipfile, check the BLOB size before invoking the virtual table. Exception handling in a host language (a TRY...CATCH block) can report the error that a patched build returns, but it cannot intercept a segmentation fault, so on an unpatched build the size check is the only reliable guard:
-- In SQLite (using a CTE to pre-filter):
WITH input(zip_blob) AS (
  SELECT zeroblob('test.zip') WHERE length(zeroblob('test.zip')) >= 22
)
SELECT name, mtime, data, method
FROM input, zipfile(input.zip_blob);
This skips the zipfile call entirely if the input is too small, because the CTE then yields no rows.
Step 4: Upgrade to a Fixed SQLite Version
The crash was reported against SQLite 3.38.1. Upgrade to version 3.38.2 or later, which includes the patch. Verify the fix by re-running the original query:
SELECT zipfile('test.zip'), mtime, data, method FROM zipfile(zeroblob('test.zip'));
With the patch, this should return an error (e.g., "malformed ZIP file") instead of crashing.
Step 5: Audit Uses of zeroblob() and randomblob()
These functions generate BLOBs without structure validation. Review all code that processes BLOBs as structured files (ZIP, images, etc.) and ensure proper validation precedes parsing.
Step 6: Enable SQLite’s Debugging Aids
Compile SQLite with -DSQLITE_DEBUG to activate internal sanity checks. Combine this with AddressSanitizer to catch memory errors early:
CFLAGS="-fsanitize=address -DSQLITE_DEBUG" ./configure
make
This helps identify invalid pointer dereferences and other memory safety issues during development.
Step 7: Implement Custom ZIP Validation in SQLite Extensions
For advanced use cases, extend the zipfile virtual table with stricter input validation. One option is to modify its xFilter method to reject invalid inputs (zipfileIsValidEOCD is a hypothetical helper, not an existing SQLite function):
static int zipfileFilter(
  sqlite3_vtab_cursor *pVtabCursor,
  int idxNum, const char *idxStr,
  int argc, sqlite3_value **argv
){
  ZipfileCsr *pCsr = (ZipfileCsr*)pVtabCursor;
  // Validate input BLOB before proceeding
  if( argc > 0 && sqlite3_value_type(argv[0]) == SQLITE_BLOB ){
    // sqlite3_value_blob() may return NULL for a zero-length BLOB
    const void *pData = sqlite3_value_blob(argv[0]);
    int nData = sqlite3_value_bytes(argv[0]);
    if( pData == 0 || nData < 22 || !zipfileIsValidEOCD(pData, nData) ){
      return SQLITE_ERROR;
    }
  }
  // Proceed with original logic...
}
This custom validation ensures that only plausibly valid ZIP files are processed.
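The zipfileIsValidEOCD helper named above is not defined anywhere in SQLite, so the following is one possible sketch of it under stated assumptions: single-disk archives only, no ZIP64 support (ZIP64 archives would be rejected), and the little-endian readers rd16 and rd32 are likewise illustrative names. It accepts a buffer only if it contains an EOCD record whose comment length and central directory bounds are self-consistent.

```c
#include <assert.h>
#include <stddef.h>

/* Read little-endian fields without assuming alignment. */
static unsigned rd16(const unsigned char *p){
  return (unsigned)p[0] | ((unsigned)p[1]<<8);
}
static unsigned long rd32(const unsigned char *p){
  return (unsigned long)p[0] | ((unsigned long)p[1]<<8)
       | ((unsigned long)p[2]<<16) | ((unsigned long)p[3]<<24);
}

/* Hypothetical helper: return 1 if the buffer holds a self-consistent
** EOCD record, else 0. */
static int zipfileIsValidEOCD(const void *pData, int nData){
  const unsigned char *a = (const unsigned char*)pData;
  int i, iMin;
  assert( nData>=0 );
  if( a==NULL || nData<22 ) return 0;
  iMin = nData>22+65535 ? nData-(22+65535) : 0;
  for(i=nData-22; i>=iMin; i--){
    const unsigned char *e = &a[i];
    unsigned long szCd, offCd;
    if( rd32(e)!=0x06054b50UL ) continue;          /* signature */
    if( (int)rd16(e+20)!=nData-i-22 ) continue;    /* comment fills the tail */
    if( rd16(e+4)!=0 || rd16(e+6)!=0 ) return 0;   /* multi-disk: rejected */
    if( rd16(e+8)!=rd16(e+10) ) return 0;          /* entry counts must agree */
    szCd = rd32(e+12);                             /* central directory size */
    offCd = rd32(e+16);                            /* central directory offset */
    /* The central directory must end at or before the EOCD record. */
    if( szCd>(unsigned long)i || offCd>(unsigned long)i-szCd ) return 0;
    return 1;
  }
  return 0;
}
```

The bounds test is written as two comparisons rather than a sum so that two large 32-bit fields cannot wrap around and pass the check.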
Final Notes
The crash stems from missing input validation in SQLite’s ZIP file handling. By validating inputs, checking the return values of file operations, and upgrading to a patched version, developers can avoid this class of errors. Always treat BLOBs as untrusted data until proven otherwise, especially when parsing them as structured formats.