Assertion Failure and Global Buffer Overflow in SQLite Base64 Functions

Issue Overview: Assertion Failure in sqlite3_result_blob and Global Buffer Overflow in fromBase64

The core issue revolves around two interrelated problems in SQLite when executing specific queries involving the base64 and regexp_bytecode functions. These problems manifest under distinct compilation environments and toolchains but share a common root in memory safety violations.

Assertion Failure in sqlite3_result_blob

The first symptom is an assertion failure triggered by the SQLite C API function sqlite3_result_blob, which enforces that the n parameter (representing the size of a blob) must be non-negative (n >= 0). The failure occurs when the base64 function passes a negative value for n. This assertion is a safeguard against invalid memory operations, as negative sizes are nonsensical in this context. The stack trace indicates that the base64 function (defined in shell.c) is the immediate caller of sqlite3_result_blob, implying that the error originates from incorrect handling of input data or memory allocation within the base64 implementation.

Global Buffer Overflow in fromBase64

The second symptom is a global buffer overflow detected by AddressSanitizer (ASAN) in the fromBase64 function. It occurs while decoding base64-encoded data, when the code reads a static lookup table (nboi) used for mapping base64 characters to their corresponding 6-bit values. The faulting access is an out-of-bounds read: the index used to address the table falls outside its bounds, which points to a missing or incorrect bounds check. The ASAN report places the faulting address near the global variables one and nboi, which are adjacent in memory. Because this is a read, it does not overwrite neighboring data, but the value it returns has nothing to do with the table, so everything computed from it is undefined behavior.

Relationship Between the Two Issues

The assertion failure and buffer overflow are linked through their dependence on the base64 and fromBase64 functions. The negative n value passed to sqlite3_result_blob likely stems from an invalid calculation in base64, which may itself be influenced by corrupted data from the buffer overflow in fromBase64. For example, if fromBase64 returns a malformed blob (due to reading beyond its lookup table), base64 might misinterpret its length, resulting in a negative size being passed to sqlite3_result_blob.

Possible Causes: Invalid Size Calculations and Static Buffer Misalignment

1. Incorrect Handling of Out-of-Memory (OOM) Conditions in base64

The base64 function retrieves input data using sqlite3_value_blob or sqlite3_value_text. Both can return NULL: when the argument is an SQL NULL, when a type conversion runs out of memory, and (for sqlite3_value_blob) when the blob has zero length. The subsequent logic in base64 must handle each of these gracefully. If the return value is not checked, the code can go on to compute sizes and offsets from a pointer and length that no longer describe a valid buffer, resulting in a negative or garbage value for n.

2. Buffer Overflow in fromBase64 Lookup Table

The fromBase64 function uses a static lookup table (nboi) to decode base64 characters. The ASAN report indicates that the overflow occurs when accessing this table, specifically at an offset 59 bytes beyond the global variable one. This suggests that the lookup table (nboi) is placed immediately after one in memory, and the code is accessing an index beyond the bounds of nboi. The root cause could be:

  • Insufficient Bounds Checking: The code may fail to validate that the input character is a valid base64 character (A-Z, a-z, 0-9, +, /) before indexing into nboi. Invalid characters could result in indices outside the table’s range.
  • Changed Global Layout in Instrumented Builds: Builds made with the AFL compiler (afl-clang-fast) or with ASAN can lay out global variables differently, and ASAN additionally surrounds globals with redzones. A latent out-of-bounds read that previously landed in padding or in a neighboring variable, and so went unnoticed, is reported as soon as it touches a redzone.
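The bounds-checking concern can be illustrated with a small sketch. It assumes a hypothetical 128-entry table named demo_table that covers only the ASCII range; this is an illustration, not SQLite's actual table or code. The point is that a plain char has to be converted to unsigned char and range-checked before it is used as an index, because on platforms where char is signed, bytes at or above 0x80 would otherwise become negative indices.

```c
#include <stddef.h>

/* Hypothetical table for illustration only (not SQLite's own table).
** One entry per ASCII code; DEMO_INVALID marks "not a base64 digit". */
#define DEMO_INVALID 0x80
static unsigned char demo_table[128];

/* Fill the table: every entry invalid, then the 64 base64 digits. */
static void demo_table_init(void){
  size_t i;
  const char *digits =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  for(i=0; i<sizeof(demo_table); i++) demo_table[i] = DEMO_INVALID;
  for(i=0; digits[i]!='\0'; i++){
    demo_table[(unsigned char)digits[i]] = (unsigned char)i;
  }
}

/* Return the 6-bit value of c, or -1 if c is not a base64 digit.
** The cast removes negative indices; the comparison removes indices
** past the end of the table. */
static int demo_lookup(char c){
  unsigned char uc = (unsigned char)c;
  if( uc >= sizeof(demo_table) ) return -1;
  if( demo_table[uc]==DEMO_INVALID ) return -1;
  return demo_table[uc];
}
```

Call demo_table_init() once before the first lookup. With this guard, a byte such as 0xE9 is rejected instead of reading whatever global happens to sit next to the table.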

3. Negative Size Calculation in base64

The assertion n >= 0 in sqlite3_result_blob implies that the base64 function is passing a negative value for the blob size. This could occur if:

  • Integer Underflow: The size calculation for the decoded blob (e.g., (nc * 3) / 4) results in a negative value due to an invalid nc (number of characters). This might happen if nc is derived from corrupted data.
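  • Incorrect Use of Signed Integers: The code uses signed integers (int) for sizes, so an intermediate calculation, such as subtracting more characters than remain, can go negative. Unsigned types (e.g., size_t) cannot hold a negative value, but they wrap around to a very large one instead, so the result must still be range-checked before it is converted to the int that sqlite3_result_blob expects.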
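A minimal sketch of the two checks this implies, using hypothetical helper names (demo_size_to_int and demo_consume are not part of SQLite): one refuses to narrow a byte count that does not fit in an int, the other refuses a subtraction that would take a remaining-length counter below zero.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: narrow a byte count to the int taken by an
** int-sized API parameter. Returns 0 on success, -1 if nb does not fit. */
static int demo_size_to_int(size_t nb, int *pOut){
  if( nb > (size_t)INT_MAX ) return -1;
  *pOut = (int)nb;
  return 0;
}

/* Hypothetical helper: consume nUsed items from a counter of nLeft items.
** Returns 0 and stores the new count, or -1 if the result would be
** negative (or nUsed itself is negative). */
static int demo_consume(int nLeft, int nUsed, int *pOut){
  if( nUsed<0 || nUsed>nLeft ) return -1;
  *pOut = nLeft - nUsed;
  return 0;
}
```

Either check turns a silent negative or wrapped size into an explicit error that the caller can report.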

Troubleshooting Steps, Solutions & Fixes

Step 1: Analyze the base64 and fromBase64 Functions

Begin by inspecting the implementation of base64 and fromBase64 in shell.c. Key areas to examine include:

  • Input Validation: Ensure that sqlite3_value_blob/text return values are checked for NULL before processing.
  • Bounds Checking in fromBase64: Verify that all input characters are valid base64 characters before indexing into nboi.
  • Size Calculations: Validate that the calculated size of the decoded blob (nb) is non-negative and does not underflow.

Example Code Snippet (Before Fix):

static void base64(
  sqlite3_context *context,
  int na,
  sqlite3_value **av
){
  const char *zText = (const char*)sqlite3_value_text(av[0]);
  int nc = zText ? strlen(zText) : 0;
  // ... decoding logic ...
  sqlite3_result_blob(context, bBuf, nb, SQLITE_TRANSIENT);
}

Issue: If zText is NULL (due to OOM), nc is set to 0, but subsequent logic may still compute nb incorrectly.

Step 2: Fix OOM Handling in base64

Modify the base64 function to handle NULL returns from sqlite3_value_text or sqlite3_value_blob by returning an SQLite error or an empty result.

Fixed Code:

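static void base64(
  sqlite3_context *context,
  int na,
  sqlite3_value **av
){
  const char *zText;
  int nc;
  /* An SQL NULL argument also yields a NULL pointer, and it is not an OOM.
  ** Returning NULL for it is one reasonable choice. */
  if( sqlite3_value_type(av[0])==SQLITE_NULL ){
    sqlite3_result_null(context);
    return;
  }
  zText = (const char*)sqlite3_value_text(av[0]);
  if( zText == NULL ){
    /* Non-NULL value but no text: the conversion ran out of memory. */
    sqlite3_result_error_nomem(context);
    return;
  }
  nc = (int)strlen(zText);
  // ... rest of the code ...
}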

Step 3: Correct Buffer Overflow in fromBase64

The fromBase64 function must ensure that all input characters are within the valid range of the nboi lookup table. This involves:

  • Adding a bounds check before accessing nboi.
  • Using unsigned characters to avoid negative indices.

Example Fix:

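/* Assumed here: one entry per byte value covered by the table,
** with 0xFF marking "not a base64 character". */
static const unsigned char nboi[] = {
    /* ... existing entries ... */
};

static int fromBase64(const char *zIn, int nIn, unsigned char *zOut){
  int i, j;
  unsigned char c;
  for(i=j=0; i<nIn; i++){
    /* Cast first: a plain char may be signed, giving a negative index. */
    c = (unsigned char)zIn[i];
    /* c is unsigned, so only the upper bound needs checking. */
    if( c >= sizeof(nboi) || nboi[c] == 0xFF ){
      // Invalid character; handle error
      return -1;
    }
    // ... decoding logic ...
  }
  return j;
}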
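For comparison, here is a self-contained sketch of a strict decoder that avoids the lookup table altogether by classifying each character with range comparisons, so there is no array index that could go out of bounds. The names b64_digit and demo_from_base64 are hypothetical, and the behavior is deliberately stricter than a lenient decoder would be (input must be padded to a multiple of four characters, and whitespace is not skipped). It is not how shell.c implements decoding.

```c
#include <limits.h>
#include <stddef.h>

/* Map one character to its 6-bit value, or -1 if it is not a base64
** digit. No table is involved, so nothing can be indexed out of range. */
static int b64_digit(unsigned char c){
  if( c>='A' && c<='Z' ) return c - 'A';
  if( c>='a' && c<='z' ) return c - 'a' + 26;
  if( c>='0' && c<='9' ) return c - '0' + 52;
  if( c=='+' ) return 62;
  if( c=='/' ) return 63;
  return -1;
}

/* Decode nIn characters of padded base64 from zIn into zOut.
** zOut must have room for (nIn/4)*3 bytes.
** Returns the number of bytes written, or -1 on malformed input
** (zOut may have been partially written in that case). */
static int demo_from_base64(const char *zIn, size_t nIn, unsigned char *zOut){
  size_t i;
  int nOut = 0;
  if( nIn%4 != 0 ) return -1;               /* must be whole groups of 4 */
  if( nIn > (size_t)INT_MAX ) return -1;    /* result must fit in an int */
  for(i=0; i<nIn; i+=4){
    int v[4];
    int k, nPad = 0;
    for(k=0; k<4; k++){
      unsigned char c = (unsigned char)zIn[i+k];
      if( c=='=' ){
        /* Padding is legal only in the last two slots of the last group. */
        if( i+4!=nIn || k<2 ) return -1;
        nPad++;
        v[k] = 0;
      }else{
        if( nPad>0 ) return -1;             /* digit after padding */
        v[k] = b64_digit(c);
        if( v[k]<0 ) return -1;             /* not a base64 digit */
      }
    }
    zOut[nOut++] = (unsigned char)((v[0]<<2) | (v[1]>>4));
    if( nPad<2 ) zOut[nOut++] = (unsigned char)(((v[1]&0x0F)<<4) | (v[2]>>2));
    if( nPad<1 ) zOut[nOut++] = (unsigned char)(((v[2]&0x03)<<6) | v[3]);
  }
  return nOut;
}
```

A table lookup is usually faster than a chain of comparisons, so this trades some speed for a decoder whose only memory accesses are to the caller's input and output buffers.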

Step 4: Ensure Non-Negative Size in sqlite3_result_blob

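Keep the size calculation from ever producing a negative value. Switching nc and nb from int to an unsigned type such as size_t removes negative intermediate results, but unsigned arithmetic wraps around rather than failing, and sqlite3_result_blob still takes an int. The computed size therefore needs an explicit range check before the cast.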

Modified Code:

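static void base64(
  sqlite3_context *context,
  int na,
  sqlite3_value **av
){
  const char *zText = (const char*)sqlite3_value_text(av[0]);
  size_t nc, nb;
  if( zText == NULL ){
    /* handle NULL input / OOM as in Step 2, then stop */
    return;
  }
  nc = strlen(zText);
  nb = (nc * 3) / 4; // Upper bound on the decoded size; unsigned arithmetic
  if( nb > (size_t)INT_MAX ){   /* INT_MAX comes from <limits.h> */
    sqlite3_result_error_toobig(context);
    return;
  }
  // ... decode into bBuf; set nb to the number of bytes actually written ...
  sqlite3_result_blob(context, bBuf, (int)nb, SQLITE_TRANSIENT);
}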
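Note that (nc * 3) / 4 is only an upper bound when the input carries padding: for the 8-character string "aGVsbG8=" it gives 6, while the decoded text "hello" is 5 bytes. A sketch of an exact calculation for strictly padded input follows; demo_decoded_size is a hypothetical helper, and it assumes the input length is a multiple of four with '=' appearing only at the very end.

```c
#include <stddef.h>

/* Exact decoded size of a padded base64 string z of nc characters.
** Returns 0 and stores the size in *pnb on success, or -1 if nc is not
** a multiple of 4 or the trailing padding is malformed. */
static int demo_decoded_size(const char *z, size_t nc, size_t *pnb){
  size_t nPad = 0;
  if( nc%4 != 0 ) return -1;
  if( nc==0 ){ *pnb = 0; return 0; }
  /* From here on nc >= 4, so z[nc-1] and z[nc-2] are in bounds. */
  if( z[nc-2]=='=' && z[nc-1]!='=' ) return -1;   /* e.g. "ab=c" */
  if( z[nc-1]=='=' ) nPad++;
  if( z[nc-2]=='=' ) nPad++;
  /* (nc/4)*3 is at least 3 and nPad is at most 2: no unsigned wraparound. */
  *pnb = (nc/4)*3 - nPad;
  return 0;
}
```

Because the subtraction is guarded by nc >= 4, the unsigned result cannot wrap, which is the property Step 4 is after.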

Step 5: Apply SQLite Patches

The SQLite team addressed these issues in specific commits:

  • Commit e6f9c0b1f963033a: Fixes OOM handling in base64 and base85 functions.
  • Commit 8f637aae23e6638c: Corrects alignment and buffer sizing in fromBase64.

Apply these patches or update to a SQLite version that includes them. Verify the fixes by recompiling with AFL and ASAN, then re-running the test queries.

Step 6: Validate with AFL and ASAN

After applying the fixes, recompile SQLite with both AFL and ASAN to confirm that the assertion failure and buffer overflow are resolved:

# AFL build
CC=afl-clang-fast ./configure --enable-debug --enable-all
make

# ASAN build
CFLAGS="-fsanitize=address" LDFLAGS="-fsanitize=address" ./configure --enable-debug
make

Execute the problematic query to ensure no crashes or sanitizer errors occur.

Step 7: Regression Testing

Implement regression tests that cover:

  • Invalid base64 input (e.g., non-base64 characters).
  • OOM conditions during sqlite3_value_text calls.
  • Edge cases like empty strings or strings with padding characters (=).

Sample Test Case:

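-- Test invalid base64 characters
SELECT base64('invalid@char');

-- Test OOM handling. soft_heap_limit is only advisory and does not make
-- allocations fail; hard_heap_limit does. The limit below is illustrative
-- and needs tuning so that the failure lands inside base64.
PRAGMA hard_heap_limit=100000;
SELECT base64('test');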

By systematically addressing the root causes (OOM mishandling, buffer overflows, and integer underflows), the assertion failure and global buffer overflow can be reliably resolved.
