Reversing HEX Function in SQLite: Solutions for UNHEX Conversion Challenges


HEX Function Output Cannot Be Directly Converted Back to Original Data Type

The core issue revolves around reversing the output of SQLite’s HEX() function to retrieve the original value. Users discovered that while HEX() converts input (text, integers, blobs) into a hexadecimal string representation, there is no built-in inverse function (UNHEX()) in SQLite versions prior to 3.41. Attempts to cast hexadecimal strings back to their original types (e.g., using CAST(x'...' AS TEXT) or CAST(x'...' AS INTEGER)) yield inconsistent results due to SQLite’s type affinity system and implicit conversions. For example:

-- Example of failed reversal:
SELECT HEX(42);          -- Returns '3432' (hex for ASCII '42')
SELECT CAST(x'3432' AS INTEGER); -- Returns 42 (correct)
SELECT HEX('Price is 3€'); -- Returns '50726963652069732033E282AC'
SELECT CAST(x'50726963652069732033E282AC' AS TEXT); -- Returns 'Price is 3€' (correct)

However, when working with dynamically generated hex strings (e.g., stored in a table column), direct casting fails because SQLite treats the hex string as a text value rather than a blob literal. This leads to incorrect conversions or partial data recovery. Additionally, comparisons between BLOB and TEXT types fail even if their content is identical, complicating round-trip validation.


Absence of Native UNHEX Function and Data Type Mismatches

1. No Built-In UNHEX Function in Older SQLite Versions

SQLite’s HEX() function converts input into a hexadecimal string, but prior to version 3.41, there was no native UNHEX() function to reverse this process. The absence of this function forced users to rely on workarounds like recursive CTEs or custom extensions. The HEX() function’s output is always a TEXT value, which cannot be directly cast back to the original data type without knowing whether the source was TEXT, INTEGER, REAL, or BLOB.

2. Type Affinity and Implicit Conversions

SQLite uses dynamic typing and type affinity, which leads to unexpected behavior when reversing hex strings. For example:

  • Numeric Data: HEX(42) returns the hex representation of the string ’42’ (ASCII values 0x34 and 0x32), not the integer 42. Reversing it with CAST(x'3432' AS INTEGER) works only if the original data was stored as an integer.
  • Text and Blobs: Hex strings derived from TEXT may include multi-byte characters (e.g., ‘€’ in UTF-8), which require proper handling during conversion. Blobs, when converted to hex strings, lose their type information, making it impossible to distinguish them from text-derived hex strings without additional metadata.

3. Character Encoding and Binary Data Handling

Hex strings generated from BLOB values or multi-byte text (e.g., UTF-8) require precise decoding. For instance, the Euro symbol ‘€’ in UTF-8 is represented as 0xE282AC. Converting this back to text without specifying the encoding may result in corruption or placeholder characters (e.g., ‘?’ in some terminals). Similarly, binary data (e.g., RANDOMBLOB(16)) converted to hex and back to text will produce unprintable characters, leading to mismatches in equality checks.


Implementing UNHEX Functionality and Correct Type Handling

1. Using Recursive CTE for Hex-to-Text Conversion (Pre-3.41)

For SQLite versions without UNHEX(), a recursive CTE can decode hex strings into text. This method processes two hex characters at a time, converts them to their ASCII equivalents, and concatenates the results:

WITH RECURSIVE hex_decode(c, remaining) AS (
  SELECT '', '686F77647921' -- Input hex string
  UNION ALL
  SELECT 
    c || CHAR(
      (CASE SUBSTR(remaining, 1, 1) 
        WHEN 'A' THEN 10 WHEN 'B' THEN 11 WHEN 'C' THEN 12 
        WHEN 'D' THEN 13 WHEN 'E' THEN 14 WHEN 'F' THEN 15 
        ELSE CAST(SUBSTR(remaining, 1, 1) AS INTEGER) END) * 16 +
      (CASE SUBSTR(remaining, 2, 1) 
        WHEN 'A' THEN 10 WHEN 'B' THEN 11 WHEN 'C' THEN 12 
        WHEN 'D' THEN 13 WHEN 'E' THEN 14 WHEN 'F' THEN 15 
        ELSE CAST(SUBSTR(remaining, 2, 1) AS INTEGER) END)
    ),
    SUBSTR(remaining, 3)
  FROM hex_decode 
  WHERE remaining != ''
)
SELECT c FROM hex_decode WHERE remaining = '';

Explanation:

  • The CTE starts with an empty result string (c) and the input hex string.
  • Each iteration processes two hex characters, converts them to a decimal value, and appends the corresponding ASCII character to c.
  • The loop continues until the entire hex string is processed.

2. Custom UNHEX Extension for SQLite (C-Language)

For environments where extensions are feasible, a custom UNHEX() function can be implemented. This function converts a hex string into a BLOB:

#include <sqlite3ext.h>
SQLITE_EXTENSION_INIT1

static void unhexFunc(sqlite3_context *context, int argc, sqlite3_value **argv) {
  const unsigned char *hex = sqlite3_value_text(argv[0]);
  int len = sqlite3_value_bytes(argv[0]);
  if (len % 2 != 0) { return; }
  
  unsigned char *blob = sqlite3_malloc(len / 2);
  for (int i = 0; i < len; i += 2) {
    int high = (hex[i] >= 'A') ? (hex[i] - 'A' + 10) : (hex[i] - '0');
    int low = (hex[i+1] >= 'A') ? (hex[i+1] - 'A' + 10) : (hex[i+1] - '0');
    blob[i/2] = (high << 4) | low;
  }
  sqlite3_result_blob(context, blob, len/2, sqlite3_free);
}

int sqlite3_unhex_init(sqlite3 *db, char **pErr, const sqlite3_api_routines *pApi) {
  SQLITE_EXTENSION_INIT2(pApi);
  return sqlite3_create_function(db, "UNHEX", 1, SQLITE_UTF8, 0, unhexFunc, 0, 0);
}

Usage:

.load ./unhex_extension
SELECT UNHEX('686F77647921'); -- Returns 'howdy!' as BLOB
SELECT CAST(UNHEX('3432') AS INTEGER); -- Returns 42

3. Upgrading to SQLite 3.41+ for Native UNHEX Support

SQLite 3.41 introduced a built-in UNHEX() function that converts hex strings to blobs. This eliminates the need for workarounds:

SELECT UNHEX('50726963652069732033E282AC'); -- Returns 'Price is 3€' as BLOB
SELECT CAST(UNHEX(HEX(42)) AS INTEGER); -- Returns 42

Important Notes:

  • UNHEX() returns a BLOB, so explicit casting is required when comparing against TEXT or numeric columns.
  • Always use CAST(UNHEX(...) AS <type>) to ensure compatibility with the original data type.

4. Handling BLOB vs TEXT Comparisons

When reversing hex strings, ensure that the result’s data type matches the original. For example:

-- Incorrect: Comparing BLOB (UNHEX result) with TEXT (original column)
SELECT COUNT(*) FROM employees WHERE lastname = UNHEX(HEX(lastname)); -- Returns 0

-- Correct: Cast UNHEX result to TEXT
SELECT COUNT(*) FROM employees WHERE lastname = CAST(UNHEX(HEX(lastname)) AS TEXT); -- Returns 8

5. Terminal and Encoding Configuration

When working with multi-byte characters (e.g., ‘€’), ensure the terminal or client uses UTF-8 encoding. In Windows Command Prompt:

chcp 65001 # Set code page to UTF-8

6. Debugging Hex Conversions

Use .mode and .dump in the SQLite CLI to inspect binary data:

.mode quote
SELECT HEX('€'), UNHEX(HEX('€')); -- Shows hex and raw bytes

By addressing data type affinity, using native or custom UNHEX() functions, and ensuring proper encoding settings, users can reliably reverse hex strings in SQLite.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *