Incorrect Directory Traversal Filtering in SQLite Archive Extraction
Issue Overview: Blocked Valid Filenames Due to Overly Restrictive GLOB Pattern
The core problem concerns how SQLite handles filenames during archive extraction in the `arExtractCommand()` function. The current implementation contains a security filter that uses a GLOB pattern match to prevent directory traversal attacks. However, the pattern also blocks legitimate filenames in which a directory name merely ends in consecutive dots.
A concrete example demonstrates the flaw: a filename like `And so it begins.../script.txt` is erroneously filtered out by the existing pattern `*..[/\\]*`, even though it is valid user content rather than a directory traversal attempt. The leading wildcard causes false positives: the pattern matches any two dots followed by a directory separator (`/` or `\`), whether or not those two dots form a complete `..` path component.
The issue appears specifically in SQLite’s archive mode when extracting files from an SQLAR archive. The security check in `arExtractCommand()` aims to reject malicious paths like `../../etc/passwd` but also rejects files with harmless runs of dots in their names. The pattern cannot distinguish a `..` path component from a filename component that merely ends in `..` before a path separator.
Possible Causes: Pattern Matching Logic Flaws and Security Tradeoffs
Three primary factors contribute to this incorrect filtering behavior:
Overly Broad Wildcard Usage in GLOB Pattern
The original filter `*..[/\\]*` uses leading and trailing wildcards that match any occurrence of `..` followed by a directory separator anywhere in the path. This catches both legitimate patterns (like `.../file.txt`, where three dots precede a separator) and malicious patterns (like `../malicious.sh`). The wildcards fail to isolate directory traversal attempts occurring at the start or middle of paths.
Insufficient Pattern Differentiation Between Attack Vectors
Directory traversal attacks typically follow two patterns:
- Paths starting with `../` (relative parent directory navigation)
- Paths containing `/../` mid-path (parent directory navigation partway through a path)
The original single GLOB pattern treats these scenarios identically while also matching non-attack filename patterns. It is not precise enough to distinguish actual security threats from normal filename variations.
Filename Validation vs. Security Filtering Conflicts
The current implementation prioritizes security filtering over filename validity checks, creating a situation where security measures inadvertently become overzealous. Files containing multiple dots in artistic or descriptive names (common in document management systems) become casualties of pattern matching designed for a different purpose.
Troubleshooting Steps, Solutions & Fixes: Refining Path Validation Logic
Step 1: Analyze Existing Pattern Matching Behavior
Execute test queries against sample data to observe pattern matching outcomes:
-- Create test dataset
CREATE TABLE path_samples(name TEXT);
INSERT INTO path_samples VALUES
('malicious/../../system.txt'),
('../sensitive.log'),
('And so it begins.../chapter1.txt'),
('archive/2023../report.pdf');
-- Original problematic filter
SELECT name FROM path_samples
WHERE name NOT GLOB '*..[/\]*';
This returns zero results, demonstrating how valid filenames get excluded alongside actual threats.
Step 2: Implement Multi-Clause Pattern Validation
Replace the single GLOB pattern with separate checks for distinct attack vectors:
-- Revised filter using compound conditions
SELECT name FROM path_samples
WHERE
name NOT GLOB '..[/\]*' -- Block paths starting with ../
AND
name NOT GLOB '*[/\]..[/\]*'; -- Block paths containing /../
This version allows `And so it begins.../chapter1.txt` while blocking true malicious patterns. The compound approach isolates directory traversal patterns without overmatching.
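The difference between the two filters can be checked outside the SQLite shell. The following is a minimal sketch, assuming Python's standard `sqlite3` module and the same sample rows as Step 1; the helper name `surviving_names` is hypothetical and exists only for this illustration.

```python
import sqlite3

SAMPLES = [
    "malicious/../../system.txt",
    "../sensitive.log",
    "And so it begins.../chapter1.txt",
    "archive/2023../report.pdf",
]

# SQLite's GLOB has no escape character, so the backslash in [/\] is literal.
OLD_FILTER = r"name NOT GLOB '*..[/\]*'"
NEW_FILTER = r"name NOT GLOB '..[/\]*' AND name NOT GLOB '*[/\]..[/\]*'"


def surviving_names(where_clause):
    """Return the sample names that pass the given WHERE clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE path_samples(name TEXT)")
    conn.executemany(
        "INSERT INTO path_samples VALUES (?)", [(s,) for s in SAMPLES]
    )
    rows = conn.execute(
        f"SELECT name FROM path_samples WHERE {where_clause} ORDER BY rowid"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


old_result = surviving_names(OLD_FILTER)  # every sample is rejected
new_result = surviving_names(NEW_FILTER)  # only the two benign names remain
print(old_result)
print(new_result)
```

With the original filter nothing survives; with the compound filter the two names whose directory component merely ends in dots are kept.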
Step 3: Update arExtractCommand() Source Code
Modify the SQL template in SQLite’s source code (src/shell.c.in):
/* Original problematic code */
" AND name NOT GLOB '*..[/\\]*'";
/* Revised validation logic */
" AND name NOT GLOB '..[/\\]*'"
" AND name NOT GLOB '*[/\\]..[/\\]*'";
This change splits the security check into two targeted conditions:
- `..[/\\]*` catches paths beginning with a directory traversal sequence
- `*[/\\]..[/\\]*` catches mid-path traversal attempts
Step 4: Comprehensive Testing Strategy
Implement a test matrix covering edge cases:
Test Case | Expected Result | Purpose |
---|---|---|
../../etc/passwd | Blocked | Basic relative path attack |
src/../include/config.h | Blocked | Mid-path traversal |
document.../file.txt | Allowed | Valid triple-dot filename |
backup../archive.zip | Allowed | Suffixed dots before separator |
C:\..\system32\ | Blocked | Windows-style path traversal |
.../safe_file.dat | Allowed | Triple-dot directory prefix |
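The matrix above can be run as a table-driven check. This is a sketch only, assuming Python's standard `sqlite3` module; `CASES` and `is_blocked` are hypothetical names, and the expectations are copied from the table rather than from SQLite's own test suite.

```python
import sqlite3

# (path, expected_to_be_blocked) pairs taken from the test matrix.
CASES = [
    ("../../etc/passwd", True),
    ("src/../include/config.h", True),
    ("document.../file.txt", False),
    ("backup../archive.zip", False),
    ("C:\\..\\system32\\", True),
    (".../safe_file.dat", False),
]


def is_blocked(conn, name):
    """Apply the two revised GLOB checks to a single name."""
    row = conn.execute(
        r"SELECT ? GLOB '..[/\]*' OR ? GLOB '*[/\]..[/\]*'", (name, name)
    ).fetchone()
    return bool(row[0])


conn = sqlite3.connect(":memory:")
results = {name: is_blocked(conn, name) for name, _ in CASES}
conn.close()

# Collect any case whose outcome differs from the table.
mismatches = [
    (name, expected, results[name])
    for name, expected in CASES
    if results[name] != expected
]
print(mismatches)
```

An empty `mismatches` list means every row of the matrix behaves as the table states.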
Step 5: Address Cross-Platform Path Separators
Ensure pattern compatibility with both UNIX and Windows path conventions by:
- Using `[/\\]` character classes to match both forward slashes and backslashes
- Testing with mixed path separators in filenames
- Accounting for platform-specific path normalization routines
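To illustrate why normalization is platform-specific, the sketch below uses Python's `ntpath` and `posixpath` modules (both importable on any platform) as stand-ins for Windows and UNIX path handling. They are not the routines SQLite itself uses; the point is only that the same stored name is read differently by each convention, which is why the filter treats both characters as separators.

```python
import ntpath
import posixpath
import sqlite3

name = "a\\..\\..\\b"  # traversal written with backslashes

# Windows rules: backslash separates components, so this climbs out of "a".
windows_view = ntpath.normpath(name)

# POSIX rules: backslash is an ordinary character, so this is one odd filename.
posix_view = posixpath.normpath(name)

# The revised filter rejects the name regardless of the host platform.
conn = sqlite3.connect(":memory:")
blocked = bool(
    conn.execute(
        r"SELECT ? GLOB '..[/\]*' OR ? GLOB '*[/\]..[/\]*'", (name, name)
    ).fetchone()[0]
)
conn.close()

print(windows_view, posix_view, blocked)
```

Because the archive may be extracted on a different platform from the one that created it, rejecting both separator styles is the conservative choice.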
Step 6: Implement Defense-in-Depth Measures
Supplement pattern matching with additional safeguards:
- Absolute Path Sanitization
Reject absolute paths starting with `/` (UNIX) or a drive letter (Windows):
    AND name NOT GLOB '/*'               -- UNIX absolute paths
    AND name NOT GLOB '[A-Za-z]:[/\]*'   -- Windows absolute paths, either separator
- Directory Creation Guards
Validate extracted paths against an allow-list of permitted directories
- Filesystem Sandboxing
Use chroot jails or containerized environments during extraction
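A directory creation guard can be as simple as resolving the final target and confirming it stays under the extraction root. The following is a minimal sketch, assuming Python's `os.path` functions; `is_within` and `extract_root` are hypothetical names, and this is not how the SQLite shell performs extraction.

```python
import os
import tempfile


def is_within(root, member_name):
    """Return True if member_name, joined onto root, resolves inside root."""
    root_real = os.path.realpath(root)
    target_real = os.path.realpath(os.path.join(root_real, member_name))
    try:
        return os.path.commonpath([root_real, target_real]) == root_real
    except ValueError:
        # Raised, for example, when the paths are on different Windows drives.
        return False


# Hypothetical extraction root; the directory does not need to exist.
extract_root = os.path.join(tempfile.gettempdir(), "sqlar_extract_demo")

checks = {
    name: is_within(extract_root, name)
    for name in [
        "docs/readme.txt",
        "And so it begins.../script.txt",
        "../sensitive.log",
        "docs/../../escape.txt",
    ]
}
print(checks)
```

Unlike a pattern match on the stored name, this check works on the resolved path, so it also accounts for symbolic links that already exist under the root.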
Step 7: Performance Optimization Considerations
Evaluate query execution plans to ensure efficient pattern matching:
EXPLAIN QUERY PLAN
SELECT name FROM path_samples
WHERE
name NOT GLOB '..[/\]*'
AND name NOT GLOB '*[/\]..[/\]*';
Expect the plan to report a full scan: a negated GLOB predicate cannot use an index, and neither can a GLOB whose pattern begins with a wildcard. For large datasets, consider maintaining a separate validation table with pre-computed path safety flags so the filter is evaluated once per row rather than on every query.
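The plan and the pre-computed flag idea can both be tried in a few lines. This is a sketch under stated assumptions: it uses Python's standard `sqlite3` module, and the `path_flags` table, its `is_safe` column, and the index name are hypothetical names invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE path_samples(name TEXT)")
conn.executemany(
    "INSERT INTO path_samples VALUES (?)",
    [("../sensitive.log",), ("And so it begins.../chapter1.txt",)],
)

FILTER = r"name NOT GLOB '..[/\]*' AND name NOT GLOB '*[/\]..[/\]*'"

# Column 3 of each EXPLAIN QUERY PLAN row holds the human-readable detail.
plan_details = [
    row[3]
    for row in conn.execute(
        f"EXPLAIN QUERY PLAN SELECT name FROM path_samples WHERE {FILTER}"
    )
]

# Hypothetical validation table: evaluate the filter once and store the flag.
conn.execute("CREATE TABLE path_flags(name TEXT, is_safe INTEGER)")
conn.execute(
    f"INSERT INTO path_flags SELECT name, ({FILTER}) FROM path_samples"
)
conn.execute("CREATE INDEX path_flags_is_safe ON path_flags(is_safe)")

safe_names = [
    row[0]
    for row in conn.execute("SELECT name FROM path_flags WHERE is_safe = 1")
]
conn.close()

print(plan_details)
print(safe_names)
```

The exact wording of the plan detail varies between SQLite versions, so it is safer to look for a leading `SCAN` than to compare the whole string.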
Step 8: Document Security Boundaries
Clearly communicate the limitations of path filtering:
- This prevents directory traversal during extraction but doesn’t validate file contents
- Additional measures required for comprehensive security (checksums, permissions)
- Recommendation to use read-only filesystem mounts for sensitive operations
Final Implementation Checklist
- Replace single GLOB pattern with compound conditions
- Verify cross-platform separator handling
- Add absolute path blocking clauses
- Implement comprehensive test cases
- Review query performance characteristics
- Update security documentation accordingly
This multi-layered approach balances security requirements with filename flexibility, addressing both the immediate filtering issue and broader path validation concerns. Developers should integrate these changes while maintaining vigilance for new attack vectors that might require future pattern adjustments.