Extracting SQLite Archives to Custom Directories with Path Sanitization
Issue Overview: Extracting SQLite Archives to Custom Paths with Security Constraints
The core challenge revolves around extracting files from an SQLite Archive (SQLAR) while prepending a custom directory path to each extracted file and enforcing security constraints. The user’s objective is twofold:
- Path Adjustment: Ensure files are extracted to a target directory (e.g.,
mydir/myfile.txt
instead of./myfile.txt
). - Security Filtering: Prevent extraction of files with absolute paths (e.g.,
/etc/passwd
) or paths containing directory traversal patterns (e.g.,../
).
SQLite’s built-in archive management commands (.ar
or -A
CLI options) provide partial solutions, but their behavior depends on nuanced configurations such as the -C
(directory) flag, the structure of the sqlar
table, and the use of SQL functions like writefile()
and sqlar_uncompress()
. Misunderstandings about parameter substitution (e.g., $dir
, $dirOnly
), GLOB pattern matching, and the internal mechanics of the SQLite shell’s archive extraction logic further complicate the issue.
This guide dissects the problem into actionable components, explores why default behaviors might not meet requirements, and provides step-by-step solutions ranging from simple CLI usage to advanced manual query customization.
Possible Causes: Why Extracted Paths or Security Checks Fail
1. Misuse of the -C
(Directory) Flag
The -C
or --directory
option specifies the base directory for extraction. However, its interaction with the sqlar.name
field is not always intuitive. For example, if sqlar.name
contains subdirectories (e.g., docs/file.txt
), -C mydir
places the file at mydir/docs/file.txt
. If the goal is to prepend mydir
to all extracted paths (regardless of their structure), additional path manipulation is required.
2. Incomplete Path Sanitization in SQL Queries
The default extraction query includes name NOT GLOB '*..[/\]*'
to block paths containing ../
or ..\
. However, this pattern:
- Fails to block paths like
abc../
(a valid directory name in some systems). - Does not check for absolute paths starting with
/
.
3. Variable Substitution in Manual Queries
When running extraction queries manually (e.g., in the SQLite shell), internal variables like $dir
(set by -C
) and $dirOnly
(used for directory creation order) are not populated. This leads to silent failures or incorrect paths.
4. Incorrect Use of writefile()
Parameters
The writefile()
function’s documentation initially shows two arguments (path
, content
), but the extraction process uses four:
writefile(path, content, mode, mtime)
Omitting mode
(file permissions) and mtime
(modification time) can result in incorrect file metadata or extraction failures for directories (which require mtime
reset after creation).
5. Verbose Output Misconfiguration
Using --verbose
instead of -v
or omitting it entirely prevents visibility into the extraction process, making debugging difficult.
Troubleshooting Steps, Solutions & Fixes
Step 1: Basic Extraction with Custom Directory
Command-Line Solution (Recommended):
Use the -C
flag to set the target directory. This appends the directory to all extracted paths:
sqlite3 example.sqlar -Axv -C mydir
-A
: Enable archive mode.x
: Extract files.v
: Verbose output (prints each extracted path).-C mydir
: Extract files intomydir/
.
Verification:
ls -l mydir/
If files are missing, check if sqlar.name
contains subdirectories. For example, if sqlar.name
is docs/file.txt
, the extracted path will be mydir/docs/file.txt
.
Step 2: Blocking Absolute Paths and Directory Traversals
Modify the Extraction Query:
The default query filters paths with ..
but misses absolute paths. Append AND name NOT GLOB '/*'
to block absolute paths:
sqlite3 example.sqlar -Axn | sed 's/WHERE (1)/WHERE (1) AND name NOT GLOB "\/*"/' | sqlite3 example.sqlar -Ax
-n
: Dry-run to output the SQL query.sed
: Edits the query to add the absolute path filter.
Manual Query Execution:
SELECT
($dir || name) AS target_path,
writefile(
($dir || name),
sqlar_uncompress(data, sz),
mode,
mtime
)
FROM sqlar
WHERE
(1)
AND (data IS NULL OR $dirOnly = 0)
AND name NOT GLOB '*..[/\]*'
AND name NOT GLOB '/*';
Step 3: Handling Manual Variable Substitution
Replicating $dir
and $dirOnly
:
When running queries outside the -A
command, replace $dir
and $dirOnly
manually:
-- Extract files to 'mydir/'
SELECT
('mydir/' || name),
writefile(
'mydir/' || name,
sqlar_uncompress(data, sz),
mode,
mtime
)
FROM sqlar
WHERE
(data IS NULL OR 0 = 0) -- $dirOnly = 0
AND name NOT GLOB '*..[/\]*'
AND name NOT GLOB '/*';
-- Create directories (run second)
SELECT
('mydir/' || name),
writefile(
'mydir/' || name,
NULL, -- Directories have NULL data
mode,
mtime
)
FROM sqlar
WHERE
(data IS NULL OR 1 = 0) -- $dirOnly = 1
AND name NOT GLOB '*..[/\]*'
AND name NOT GLOB '/*';
Step 4: Python Script for Advanced Control
Script Overview:
- Connects to the SQLAR database.
- Extracts files to a custom directory.
- Skips absolute paths and invalid names.
Code:
import sqlite3
import os
import sys
def extract_sqlar(db_path, target_dir='.'):
conn = sqlite3.connect(db_path)
cursor = conn.cursor()
cursor.execute("SELECT name, mode, mtime, data, sz FROM sqlar")
for name, mode, mtime, data, sz in cursor.fetchall():
# Skip absolute paths
if name.startswith('/'):
print(f"Skipping absolute path: {name}", file=sys.stderr)
continue
# Skip directory traversals
if '..' in name.split('/'):
print(f"Skipping invalid path: {name}", file=sys.stderr)
continue
target_path = os.path.join(target_dir, name)
os.makedirs(os.path.dirname(target_path), exist_ok=True)
# Write file (handle directories)
if data is None:
os.makedirs(target_path, exist_ok=True)
else:
content = sqlite3.Binary(data)
if sz != len(data): # Compressed data
# Implement sqlar_uncompress() logic here if needed
pass
with open(target_path, 'wb') as f:
f.write(content)
# Set permissions and mtime
os.chmod(target_path, mode)
os.utime(target_path, (mtime, mtime))
conn.close()
if __name__ == '__main__':
extract_sqlar('example.sqlar', 'mydir')
Step 5: Validating Archive Contents Pre-Extraction
Check for Absolute Paths:
sqlite3 example.sqlar -A "SELECT name FROM sqlar WHERE name GLOB '/*'"
If output exists, the archive contains absolute paths.
Check for Directory Traversals:
sqlite3 example.sqlar -A "SELECT name FROM sqlar WHERE name GLOB '*..[/\]*'"
Step 6: Debugging Verbose Output
Correct Verbose Usage:
sqlite3 example.sqlar -Axv -C mydir
Outputs each extracted path:
/mydir/file1.txt
/mydir/docs/file2.txt
Common Pitfalls:
- Using
--verbose
instead of-v
causes errors. - Verbose output only shows paths after variable substitution (e.g.,
$dir || name
).
This guide systematically addresses path customization, security filtering, and variable handling in SQLite archive extraction. By combining CLI flags, query modifications, and scripting, users can achieve precise control over the extraction process.