Memory Overrun in SQLite URI Handling and Filename Parsing
Memory Corruption Risks in SQLite URI Parameter Handling and Filename Extraction
Database Engine Behavior During Filename Processing and URI Parameter Extraction
The core issue is how SQLite processes database filenames passed to the sqlite3_open_v2() API when they contain URI parameters. Specifically, two operations interact to create memory safety risks:
Buried Filename Extraction via Backward Memory Traversal:
The databaseName() function attempts to locate the "true" database filename by scanning backward from the input string's memory address until it finds four consecutive zero bytes. This mechanism assumes the input string resides within a memory region with sufficient padding before its starting address to prevent out-of-bounds access.
URI Parameter Parsing with Heap-Allocated Strings:
When a URI-formatted filename (e.g., file:data.db?mode=ro) is passed to sqlite3_open_v2(), the sqlite3ParseUri() function allocates a modified version of the filename using sqlite3_malloc64(). The heap manager may place this allocation near the edge of a valid memory block, leaving too few preceding bytes for databaseName() to scan backward safely.
These operations intersect when URI parameters are accessed via sqlite3_uri_parameter() or sqlite3_uri_key(), which internally invoke databaseName() on the modified filename string. If the heap-allocated string lacks guard bytes before its starting address, the backward scan reads outside the allocation. This triggers address sanitizer errors or, when no sanitizer is active, silent out-of-bounds reads.
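To make the layout assumption concrete, here is a minimal standalone model of the backward scan. The names find_buried_name, demo_buf and demo_journal_name are hypothetical, and the buffer layout is simplified for illustration. It is not SQLite's actual source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the backward scan: step back from z until the four
** bytes immediately before the pointer are all zero. This is only safe when
** the caller guarantees that such a prefix exists inside the same buffer. */
static const char *find_buried_name(const char *z){
  assert( z!=0 );
  while( z[-1]!=0 || z[-2]!=0 || z[-3]!=0 || z[-4]!=0 ){
    z--;
  }
  return z;
}

/* Simplified layout: four zero bytes, the database name, a journal name. */
static const char demo_buf[] = "\0\0\0\0data.db\0data.db-journal\0";

/* Returns a pointer to the journal name stored after the database name. */
static const char *demo_journal_name(void){
  return demo_buf + 4 + strlen(demo_buf + 4) + 1;
}
```

Starting from the journal name, the scan walks back to the byte right after the four-zero prefix, which is the database name. If the same scan started from a string placed at the very first byte of an allocation, it would have to read bytes that do not belong to the buffer.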
Heap Allocation Boundaries and String Processing Assumptions
Three primary factors contribute to the memory overrun:
A. Absence of Guard Bytes in Heap-Allocated URI Strings
SQLite's memory allocator (sqlite3_malloc64()) returns blocks aligned to 8-byte boundaries but does not guarantee that the memory before the allocated region is initialized or even readable. When sqlite3ParseUri() creates a modified filename string, it writes the new string starting at the first byte of the allocated block. Subsequent calls to databaseName() on this string scan backward from its starting address, potentially accessing unallocated heap memory or guard pages.
B. Reliance on Application-Provided Filename Memory Layout
The databaseName() function assumes that the input filename pointer is one that was handed to a VFS implementation's xOpen() method, which SQLite's default VFS stores in a memory region with four zero bytes preceding the string. This assumption breaks when the string is heap-allocated by sqlite3ParseUri(), because the allocator provides no such padding.
C. Unbounded Forward Scanning in sqlite3Strlen30()
After extracting the buried filename via backward scanning, sqlite3Strlen30() computes the string length by scanning forward until a zero byte is found. If the backward scan incorrectly identifies the start of the string (due to missing guard bytes), the forward scan may read beyond the end of the allocated buffer.
Mitigating Memory Overruns in Filename Handling Routines
Step 1: Validate Heap-Allocated URI String Memory Boundaries
Modify sqlite3ParseUri() so that heap-allocated filenames include leading guard bytes. The function is shown with a simplified signature for illustration:
char *sqlite3ParseUri(const char *zUri){
  size_t len = strlen(zUri);
  /* Allocate 8 extra bytes: 4 leading guard bytes, 4 trailing zero bytes */
  char *zOut = sqlite3_malloc64(len + 8);
  if( zOut ){
    memset(zOut, 0, len + 8);  /* Zero the guard bytes and the terminators */
    zOut += 4;                 /* String starts after the leading guard */
    memcpy(zOut, zUri, len);   /* Simplified: copies the URI unprocessed */
  }
  return zOut;  /* Release with sqlite3_free(zOut - 4), not sqlite3_free(zOut) */
}
This ensures that four zero bytes precede the parsed filename string, satisfying the backward scan requirement of databaseName().
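One practical consequence of offsetting the returned pointer is that allocation and release have to be paired. The following standalone sketch illustrates that pairing. The helpers guarded_strdup and guarded_free are hypothetical names, and calloc()/free() stand in for SQLite's allocator so the example builds on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 4   /* Leading zero bytes expected by the backward scan */

/* Hypothetical helper: copy zSrc into a buffer laid out as
** [4 zero bytes][string][4 zero bytes] and return a pointer to the string.
** calloc() zero-fills the block, which provides guards and terminators. */
static char *guarded_strdup(const char *zSrc){
  size_t n;
  char *p;
  assert( zSrc!=0 );
  n = strlen(zSrc);
  p = calloc(1, GUARD_BYTES + n + GUARD_BYTES);
  if( p==0 ) return 0;
  memcpy(p + GUARD_BYTES, zSrc, n);
  return p + GUARD_BYTES;
}

/* The pointer handed out is offset into the allocation, so it must be
** moved back to the start of the block before it is released. */
static void guarded_free(char *z){
  if( z ) free(z - GUARD_BYTES);
}
```

Passing the offset pointer straight to free() would be undefined behavior, so any code path that releases such a filename needs to know about the guard prefix.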
Step 2: Hardened Backward Scanning in databaseName()
Implement boundary checks during backward traversal:
static const char *databaseName(const char *zName){
  const char *z = zName;
  /* sqlite3MemoryBaseAddr() is a placeholder, see the note after this code */
  const char *zBase = (const char*)sqlite3MemoryBaseAddr(z);
  int i;
  /* Limit the backward scan to 4096 bytes and never read below zBase */
  for(i=0; i<4096 && z-zBase>=4; i++, z--){
    if( z[-1]==0 && z[-2]==0 && z[-3]==0 && z[-4]==0 ){
      return z;  /* Name begins right after the four zero bytes */
    }
  }
  return zName;  /* Fallback if guard bytes not found */
}
Replace sqlite3MemoryBaseAddr() with a platform-specific function that returns the lowest valid heap address. This prevents the scan from reading below the start of the heap.
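Where the caller already knows the start of the buffer that holds the name, the lower bound can be passed in directly rather than looked up. The sketch below assumes that situation. The function find_name_bounded and its zStart parameter are hypothetical and not part of SQLite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded variant: zStart is the first byte of the buffer that
** holds zName. Returns the position right after the nearest run of four zero
** bytes at or before zName, or NULL if no such run exists in the buffer. */
static const char *find_name_bounded(const char *zStart, const char *zName){
  const char *z = zName;
  assert( zStart!=0 && zName!=0 && zName>=zStart );
  while( z - zStart >= 4 ){
    if( z[-1]==0 && z[-2]==0 && z[-3]==0 && z[-4]==0 ) return z;
    z--;
  }
  return NULL;   /* No guard prefix: report failure instead of reading on */
}
```

Returning NULL makes the missing-guard case visible to the caller, whereas falling back to the input pointer would hide it.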
Step 3: Bounded String Length Calculation
Modify sqlite3Strlen30() to accept a maximum length parameter:
int sqlite3Strlen30(const char *z, int max_len){
  int n = 0;
  /* Test the bound before reading z[n] so no byte past max_len is touched */
  if( z ) while( n<max_len && z[n] ) n++;
  return n;
}
Update call sites that process potentially unsafe strings to supply max_len based on the allocation size known from sqlite3ParseUri().
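As an illustration of such a call site, the following standalone sketch derives the bound from a known buffer start and size. Both bounded_len and name_len_in_buffer are hypothetical names, and the assumption is that the caller still has the allocation size available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded length: never reads z[n] for n >= nMax. */
static int bounded_len(const char *z, int nMax){
  int n = 0;
  if( z==0 ) return 0;
  while( n<nMax && z[n]!=0 ) n++;
  return n;
}

/* Hypothetical call site: zName points into a buffer of nAlloc bytes that
** starts at zBuf, so at most (zBuf + nAlloc - zName) bytes may be read. */
static int name_len_in_buffer(
  const char *zBuf, size_t nAlloc, const char *zName
){
  size_t nAvail;
  assert( zName>=zBuf && (size_t)(zName - zBuf)<=nAlloc );
  nAvail = nAlloc - (size_t)(zName - zBuf);
  if( nAvail>0x3fffffff ) nAvail = 0x3fffffff;   /* keep within int range */
  return bounded_len(zName, (int)nAvail);
}
```

When the buffer contains no terminator, the result equals the number of bytes available, which a caller can treat as a sign of a malformed filename.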
Step 4: VFS-Level Validation of Filename Storage
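Enforce that VFS implementations storing filenames for later retrieval via sqlite3_uri_parameter() reserve four leading zero bytes:
int sqlite3OsOpen(
  sqlite3_vfs *pVfs,
  const char *zName,
  sqlite3_file *pFile,
  int flags,
  int *pOutFlags
){
  size_t n;
  char *zStoredName;
  /* Temporary files have no name, so there is nothing to copy */
  if( zName==0 ) return pVfs->xOpen(pVfs, 0, pFile, flags, pOutFlags);
  /* Allocate filename with leading guard bytes */
  n = strlen(zName);
  zStoredName = sqlite3_malloc64(n + 8);
  if( zStoredName==0 ) return SQLITE_NOMEM;
  memset(zStoredName, 0, n + 8);  /* Zero guard bytes and terminators */
  memcpy(zStoredName + 4, zName, n);
  /* Store zStoredName+4 in VFS-specific structure; free zStoredName on close */
  return pVfs->xOpen(pVfs, zStoredName + 4, pFile, flags, pOutFlags);
}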
This guarantees that when sqlite3_uri_parameter() retrieves the filename from the VFS, databaseName() can safely scan backward.
Step 5: Deprecate Unsafe URI Parameter Functions
Introduce new APIs that explicitly handle buffer boundaries:
/* Replacement for sqlite3_uri_parameter() with length checking */
const char *sqlite3_uri_parameter_safe(
  const char *zFilename,  /* Original database filename */
  int nFilename,          /* Bytes in zFilename buffer */
  const char *zParam      /* Parameter name */
);
Update internal callers to use safe variants, phasing out the original functions in future releases.
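A length-checked lookup of this kind might look like the following standalone sketch. It assumes a buffer layout of the database name followed by zero-terminated key/value pairs and a final empty string. The names uri_parameter_bounded and len_before are hypothetical, and the sketch does not reproduce SQLite's internal filename format exactly.

```c
#include <assert.h>
#include <string.h>

/* Length of the string at z, reading no further than zEnd. Returns -1 if
** no terminator is found before zEnd. */
static int len_before(const char *z, const char *zEnd){
  const char *p = memchr(z, 0, (size_t)(zEnd - z));
  return p ? (int)(p - z) : -1;
}

/* Hypothetical bounded lookup. Assumed buffer layout:
**   name \0 key1 \0 value1 \0 key2 \0 value2 \0 ... \0
** Returns the value for zParam, or NULL if it is absent or if the buffer
** ends before the layout is complete. */
static const char *uri_parameter_bounded(
  const char *zFilename, int nFilename, const char *zParam
){
  const char *z = zFilename;
  const char *zEnd;
  int n;
  assert( zFilename!=0 && zParam!=0 && nFilename>=0 );
  zEnd = zFilename + nFilename;
  if( (n = len_before(z, zEnd))<0 ) return 0;
  z += n + 1;                        /* skip the database name */
  while( z<zEnd && z[0]!=0 ){
    const char *zKey = z;
    if( (n = len_before(z, zEnd))<0 ) return 0;
    z += n + 1;                      /* now at the value */
    if( z>=zEnd || (n = len_before(z, zEnd))<0 ) return 0;
    if( strcmp(zKey, zParam)==0 ) return z;
    z += n + 1;                      /* move on to the next key */
  }
  return 0;
}
```

Because every step is checked against the end of the buffer, a truncated or malformed filename yields NULL rather than a read past the allocation.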
Step 6: Static Analysis Rules for SQLite Internals
Implement custom Clang analyzer checks to detect:
- Calls to databaseName() on non-VFS-managed strings
- Missing length parameters in sqlite3Strlen30() invocations
- Heap allocations of URI strings without leading guard bytes
Integrate these checks into SQLite’s build process to prevent regressions.
Long-Term Architectural Considerations
Eliminate Backward Scanning Requirement:
Modify the VFS filename storage format to prepend an explicit length prefix or offset value instead of relying on magic guard bytes. For example:
struct VfsFileName {
  int iBuriedNameOffset;  /* Offset to "real" filename */
  char zData[];           /* Contains original URI string */
};
This allows direct computation of the buried name position without memory traversal.
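The offset-based record could be built and queried roughly as follows. This is a sketch under the assumption that the caller already knows where the real filename starts inside the URI string. The helpers vfs_filename_new and vfs_filename_name are hypothetical, and malloc() stands in for SQLite's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: the offset of the "real" filename inside zData is
** stored explicitly, so no backward scan is needed to find it. */
struct VfsFileName {
  int iBuriedNameOffset;   /* Offset of the real filename within zData */
  char zData[];            /* Original URI string (C99 flexible array) */
};

/* Build a record from a URI string and the offset of the filename in it.
** Returns NULL on allocation failure or if the offset is out of range. */
static struct VfsFileName *vfs_filename_new(const char *zUri, int iOffset){
  size_t n;
  struct VfsFileName *p;
  assert( zUri!=0 );
  n = strlen(zUri);
  if( iOffset<0 || (size_t)iOffset>n ) return 0;
  p = malloc(sizeof(*p) + n + 1);
  if( p==0 ) return 0;
  p->iBuriedNameOffset = iOffset;
  memcpy(p->zData, zUri, n + 1);     /* copy the terminator as well */
  return p;
}

/* Direct computation of the buried name position. */
static const char *vfs_filename_name(const struct VfsFileName *p){
  return p->zData + p->iBuriedNameOffset;
}
```

Validating the offset once at construction time means later lookups are a single pointer addition with no dependence on the bytes surrounding the allocation.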
Heap Allocator Integration:
Enhance sqlite3_malloc64() to optionally reserve leading/trailing guard regions for sensitive allocations. Add a flag parameter:
void *sqlite3_malloc64_ex(
  sqlite3_uint64 n,  /* Bytes to allocate */
  unsigned flags     /* SQLITE_MALLOC_GUARDED etc. */
);
Platform-Specific Memory Layout Profiling:
Develop test harnesses that validate filename handling across memory allocator configurations (dlmalloc, jemalloc, tcmalloc) and address space layouts (32-bit vs 64-bit, ASLR settings).
By combining immediate hardening measures with strategic architectural changes, SQLite can eliminate memory safety risks in URI processing while maintaining backward compatibility for most use cases. Critical systems should adopt the safe API variants and enable static analysis checks during their SQLite integration processes.