Extracting SQLite Archives to Custom Directories with Path Sanitization

Issue Overview: Extracting SQLite Archives to Custom Paths with Security Constraints

The core challenge revolves around extracting files from an SQLite Archive (SQLAR) while prepending a custom directory path to each extracted file and enforcing security constraints. The user’s objective is twofold:

  1. Path Adjustment: Ensure files are extracted to a target directory (e.g., mydir/myfile.txt instead of ./myfile.txt).
  2. Security Filtering: Prevent extraction of files with absolute paths (e.g., /etc/passwd) or paths containing directory traversal patterns (e.g., ../).

SQLite’s built-in archive management commands (.ar or -A CLI options) provide partial solutions, but their behavior depends on nuanced configurations such as the -C (directory) flag, the structure of the sqlar table, and the use of SQL functions like writefile() and sqlar_uncompress(). Misunderstandings about parameter substitution (e.g., $dir, $dirOnly), GLOB pattern matching, and the internal mechanics of the SQLite shell’s archive extraction logic further complicate the issue.

This guide dissects the problem into actionable components, explores why default behaviors might not meet requirements, and provides step-by-step solutions ranging from simple CLI usage to advanced manual query customization.


Possible Causes: Why Extracted Paths or Security Checks Fail

1. Misuse of the -C (Directory) Flag

The -C or --directory option specifies the base directory for extraction. However, its interaction with the sqlar.name field is not always intuitive. For example, if sqlar.name contains subdirectories (e.g., docs/file.txt), -C mydir places the file at mydir/docs/file.txt. If the goal is to prepend mydir to all extracted paths (regardless of their structure), additional path manipulation is required.

2. Incomplete Path Sanitization in SQL Queries

The default extraction query includes name NOT GLOB '*..[/\]*' to block paths containing ../ or ..\. However, this pattern:

  • Fails to block paths like abc../ (a valid directory name in some systems).
  • Does not check for absolute paths starting with /.

3. Variable Substitution in Manual Queries

When running extraction queries manually (e.g., in the SQLite shell), internal variables like $dir (set by -C) and $dirOnly (used for directory creation order) are not populated. This leads to silent failures or incorrect paths.

4. Incorrect Use of writefile() Parameters

The writefile() function’s documentation initially shows two arguments (path, content), but the extraction process uses four:

writefile(path, content, mode, mtime)  

Omitting mode (file permissions) and mtime (modification time) can result in incorrect file metadata or extraction failures for directories (which require mtime reset after creation).

5. Verbose Output Misconfiguration

Using --verbose instead of -v or omitting it entirely prevents visibility into the extraction process, making debugging difficult.


Troubleshooting Steps, Solutions & Fixes

Step 1: Basic Extraction with Custom Directory

Command-Line Solution (Recommended):
Use the -C flag to set the target directory. This appends the directory to all extracted paths:

sqlite3 example.sqlar -Axv -C mydir  
  • -A: Enable archive mode.
  • x: Extract files.
  • v: Verbose output (prints each extracted path).
  • -C mydir: Extract files into mydir/.

Verification:

ls -l mydir/  

If files are missing, check if sqlar.name contains subdirectories. For example, if sqlar.name is docs/file.txt, the extracted path will be mydir/docs/file.txt.

Step 2: Blocking Absolute Paths and Directory Traversals

Modify the Extraction Query:
The default query filters paths with .. but misses absolute paths. Append AND name NOT GLOB '/*' to block absolute paths:

sqlite3 example.sqlar -Axn | sed 's/WHERE (1)/WHERE (1) AND name NOT GLOB "\/*"/' | sqlite3 example.sqlar -Ax  
  • -n: Dry-run to output the SQL query.
  • sed: Edits the query to add the absolute path filter.

Manual Query Execution:

SELECT 
  ($dir || name) AS target_path,
  writefile(
    ($dir || name),
    sqlar_uncompress(data, sz),
    mode,
    mtime
  ) 
FROM sqlar 
WHERE 
  (1) 
  AND (data IS NULL OR $dirOnly = 0) 
  AND name NOT GLOB '*..[/\]*' 
  AND name NOT GLOB '/*';  

Step 3: Handling Manual Variable Substitution

Replicating $dir and $dirOnly:
When running queries outside the -A command, replace $dir and $dirOnly manually:

-- Extract files to 'mydir/'  
SELECT 
  ('mydir/' || name),
  writefile(
    'mydir/' || name,
    sqlar_uncompress(data, sz),
    mode,
    mtime
  ) 
FROM sqlar 
WHERE 
  (data IS NULL OR 0 = 0)  -- $dirOnly = 0  
  AND name NOT GLOB '*..[/\]*' 
  AND name NOT GLOB '/*';  

-- Create directories (run second)  
SELECT 
  ('mydir/' || name),
  writefile(
    'mydir/' || name,
    NULL,  -- Directories have NULL data  
    mode,
    mtime
  ) 
FROM sqlar 
WHERE 
  (data IS NULL OR 1 = 0)  -- $dirOnly = 1  
  AND name NOT GLOB '*..[/\]*' 
  AND name NOT GLOB '/*';  

Step 4: Python Script for Advanced Control

Script Overview:

  • Connects to the SQLAR database.
  • Extracts files to a custom directory.
  • Skips absolute paths and invalid names.

Code:

import sqlite3  
import os  
import sys  

def extract_sqlar(db_path, target_dir='.'):  
    conn = sqlite3.connect(db_path)  
    cursor = conn.cursor()  
    cursor.execute("SELECT name, mode, mtime, data, sz FROM sqlar")  

    for name, mode, mtime, data, sz in cursor.fetchall():  
        # Skip absolute paths  
        if name.startswith('/'):  
            print(f"Skipping absolute path: {name}", file=sys.stderr)  
            continue  

        # Skip directory traversals  
        if '..' in name.split('/'):  
            print(f"Skipping invalid path: {name}", file=sys.stderr)  
            continue  

        target_path = os.path.join(target_dir, name)  
        os.makedirs(os.path.dirname(target_path), exist_ok=True)  

        # Write file (handle directories)  
        if data is None:  
            os.makedirs(target_path, exist_ok=True)  
        else:  
            content = sqlite3.Binary(data)  
            if sz != len(data):  # Compressed data  
                # Implement sqlar_uncompress() logic here if needed  
                pass  
            with open(target_path, 'wb') as f:  
                f.write(content)  

        # Set permissions and mtime  
        os.chmod(target_path, mode)  
        os.utime(target_path, (mtime, mtime))  

    conn.close()  

if __name__ == '__main__':  
    extract_sqlar('example.sqlar', 'mydir')  

Step 5: Validating Archive Contents Pre-Extraction

Check for Absolute Paths:

sqlite3 example.sqlar -A "SELECT name FROM sqlar WHERE name GLOB '/*'"  

If output exists, the archive contains absolute paths.

Check for Directory Traversals:

sqlite3 example.sqlar -A "SELECT name FROM sqlar WHERE name GLOB '*..[/\]*'"  

Step 6: Debugging Verbose Output

Correct Verbose Usage:

sqlite3 example.sqlar -Axv -C mydir  

Outputs each extracted path:

/mydir/file1.txt  
/mydir/docs/file2.txt  

Common Pitfalls:

  • Using --verbose instead of -v causes errors.
  • Verbose output only shows paths after variable substitution (e.g., $dir || name).

This guide systematically addresses path customization, security filtering, and variable handling in SQLite archive extraction. By combining CLI flags, query modifications, and scripting, users can achieve precise control over the extraction process.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *