Reversing HEX Function in SQLite: Solutions for UNHEX Conversion Challenges
HEX Function Output Cannot Be Directly Converted Back to Original Data Type
The core issue revolves around reversing the output of SQLite’s HEX()
function to retrieve the original value. Users discovered that while HEX()
converts input (text, integers, blobs) into a hexadecimal string representation, there is no built-in inverse function (UNHEX()
) in SQLite versions prior to 3.41. Attempts to cast hexadecimal strings back to their original types (e.g., using CAST(x'...' AS TEXT)
or CAST(x'...' AS INTEGER)
) yield inconsistent results due to SQLite’s type affinity system and implicit conversions. For example:
-- Example of failed reversal:
SELECT HEX(42); -- Returns '3432' (hex for ASCII '42')
SELECT CAST(x'3432' AS INTEGER); -- Returns 42 (correct)
SELECT HEX('Price is 3€'); -- Returns '50726963652069732033E282AC'
SELECT CAST(x'50726963652069732033E282AC' AS TEXT); -- Returns 'Price is 3€' (correct)
However, when working with dynamically generated hex strings (e.g., stored in a table column), direct casting fails because SQLite treats the hex string as a text value rather than a blob literal. This leads to incorrect conversions or partial data recovery. Additionally, comparisons between BLOB
and TEXT
types fail even if their content is identical, complicating round-trip validation.
Absence of Native UNHEX Function and Data Type Mismatches
1. No Built-In UNHEX Function in Older SQLite Versions
SQLite’s HEX()
function converts input into a hexadecimal string, but prior to version 3.41, there was no native UNHEX()
function to reverse this process. The absence of this function forced users to rely on workarounds like recursive CTEs or custom extensions. The HEX()
function’s output is always a TEXT
value, which cannot be directly cast back to the original data type without knowing whether the source was TEXT
, INTEGER
, REAL
, or BLOB
.
2. Type Affinity and Implicit Conversions
SQLite uses dynamic typing and type affinity, which leads to unexpected behavior when reversing hex strings. For example:
- Numeric Data:
HEX(42)
returns the hex representation of the string ’42’ (ASCII values 0x34 and 0x32), not the integer 42. Reversing it withCAST(x'3432' AS INTEGER)
works only if the original data was stored as an integer. - Text and Blobs: Hex strings derived from
TEXT
may include multi-byte characters (e.g., ‘€’ in UTF-8), which require proper handling during conversion. Blobs, when converted to hex strings, lose their type information, making it impossible to distinguish them from text-derived hex strings without additional metadata.
3. Character Encoding and Binary Data Handling
Hex strings generated from BLOB
values or multi-byte text (e.g., UTF-8) require precise decoding. For instance, the Euro symbol ‘€’ in UTF-8 is represented as 0xE282AC
. Converting this back to text without specifying the encoding may result in corruption or placeholder characters (e.g., ‘?’ in some terminals). Similarly, binary data (e.g., RANDOMBLOB(16)
) converted to hex and back to text will produce unprintable characters, leading to mismatches in equality checks.
Implementing UNHEX Functionality and Correct Type Handling
1. Using Recursive CTE for Hex-to-Text Conversion (Pre-3.41)
For SQLite versions without UNHEX()
, a recursive CTE can decode hex strings into text. This method processes two hex characters at a time, converts them to their ASCII equivalents, and concatenates the results:
WITH RECURSIVE hex_decode(c, remaining) AS (
SELECT '', '686F77647921' -- Input hex string
UNION ALL
SELECT
c || CHAR(
(CASE SUBSTR(remaining, 1, 1)
WHEN 'A' THEN 10 WHEN 'B' THEN 11 WHEN 'C' THEN 12
WHEN 'D' THEN 13 WHEN 'E' THEN 14 WHEN 'F' THEN 15
ELSE CAST(SUBSTR(remaining, 1, 1) AS INTEGER) END) * 16 +
(CASE SUBSTR(remaining, 2, 1)
WHEN 'A' THEN 10 WHEN 'B' THEN 11 WHEN 'C' THEN 12
WHEN 'D' THEN 13 WHEN 'E' THEN 14 WHEN 'F' THEN 15
ELSE CAST(SUBSTR(remaining, 2, 1) AS INTEGER) END)
),
SUBSTR(remaining, 3)
FROM hex_decode
WHERE remaining != ''
)
SELECT c FROM hex_decode WHERE remaining = '';
Explanation:
- The CTE starts with an empty result string (
c
) and the input hex string. - Each iteration processes two hex characters, converts them to a decimal value, and appends the corresponding ASCII character to
c
. - The loop continues until the entire hex string is processed.
2. Custom UNHEX Extension for SQLite (C-Language)
For environments where extensions are feasible, a custom UNHEX()
function can be implemented. This function converts a hex string into a BLOB
:
#include <sqlite3ext.h>
SQLITE_EXTENSION_INIT1
static void unhexFunc(sqlite3_context *context, int argc, sqlite3_value **argv) {
const unsigned char *hex = sqlite3_value_text(argv[0]);
int len = sqlite3_value_bytes(argv[0]);
if (len % 2 != 0) { return; }
unsigned char *blob = sqlite3_malloc(len / 2);
for (int i = 0; i < len; i += 2) {
int high = (hex[i] >= 'A') ? (hex[i] - 'A' + 10) : (hex[i] - '0');
int low = (hex[i+1] >= 'A') ? (hex[i+1] - 'A' + 10) : (hex[i+1] - '0');
blob[i/2] = (high << 4) | low;
}
sqlite3_result_blob(context, blob, len/2, sqlite3_free);
}
int sqlite3_unhex_init(sqlite3 *db, char **pErr, const sqlite3_api_routines *pApi) {
SQLITE_EXTENSION_INIT2(pApi);
return sqlite3_create_function(db, "UNHEX", 1, SQLITE_UTF8, 0, unhexFunc, 0, 0);
}
Usage:
.load ./unhex_extension
SELECT UNHEX('686F77647921'); -- Returns 'howdy!' as BLOB
SELECT CAST(UNHEX('3432') AS INTEGER); -- Returns 42
3. Upgrading to SQLite 3.41+ for Native UNHEX Support
SQLite 3.41 introduced a built-in UNHEX()
function that converts hex strings to blobs. This eliminates the need for workarounds:
SELECT UNHEX('50726963652069732033E282AC'); -- Returns 'Price is 3€' as BLOB
SELECT CAST(UNHEX(HEX(42)) AS INTEGER); -- Returns 42
Important Notes:
UNHEX()
returns aBLOB
, so explicit casting is required when comparing againstTEXT
or numeric columns.- Always use
CAST(UNHEX(...) AS <type>)
to ensure compatibility with the original data type.
4. Handling BLOB vs TEXT Comparisons
When reversing hex strings, ensure that the result’s data type matches the original. For example:
-- Incorrect: Comparing BLOB (UNHEX result) with TEXT (original column)
SELECT COUNT(*) FROM employees WHERE lastname = UNHEX(HEX(lastname)); -- Returns 0
-- Correct: Cast UNHEX result to TEXT
SELECT COUNT(*) FROM employees WHERE lastname = CAST(UNHEX(HEX(lastname)) AS TEXT); -- Returns 8
5. Terminal and Encoding Configuration
When working with multi-byte characters (e.g., ‘€’), ensure the terminal or client uses UTF-8 encoding. In Windows Command Prompt:
chcp 65001 # Set code page to UTF-8
6. Debugging Hex Conversions
Use .mode
and .dump
in the SQLite CLI to inspect binary data:
.mode quote
SELECT HEX('€'), UNHEX(HEX('€')); -- Shows hex and raw bytes
By addressing data type affinity, using native or custom UNHEX()
functions, and ensuring proper encoding settings, users can reliably reverse hex strings in SQLite.