Updating Multiple Rows via JSON Array in SQLite: Handling NULL Values
Issue Overview: JSON-Based Bulk Update Producing NULL Values Instead of Expected Numbers
The core challenge involves attempting to update multiple rows in an SQLite table using values extracted from a JSON array. The table t
contains columns f_id
(foreign key), t_id
(primary key), and amount
(numeric value). A JSON object contains an array of objects under $.obj.arr
, each specifying t_id
and a new amount
. The goal is to update the amount
column in all rows matching the t_id
values in the JSON array.
The original approach used nested json_extract()
and json_each()
calls to parse the JSON structure and correlate t_id
values between the table and JSON data. However, this resulted in all amount
values being set to NULL
instead of the expected numbers. Subsequent analysis revealed that intermediate steps in the JSON parsing logic inadvertently stripped key names from the JSON objects, making it impossible to reference $.amount
correctly.
This issue highlights three critical aspects of SQLite’s JSON functions:
- The behavior of
json_extract()
when extracting multiple fields. - The structure of data returned by
json_each()
in nested subqueries. - The importance of preserving JSON object key names during extraction to enable subsequent field access.
Possible Causes: Misalignment Between JSON Parsing Logic and SQLite Function Semantics
Cause 1: Incorrect Use of json_extract()
for Multiple Field Extraction
The original query used json_extract(a.value, '$.amount', '$.t_id') as arr
, assuming this would return a JSON object with both fields. However, json_extract()
returns a concatenated array of values without keys when multiple paths are specified. For example, extracting $.amount
and $.t_id
from {"t_id":1,"amount":444}
returns [444,1]
instead of {"amount":444,"t_id":1}
. This stripped the key names, making it impossible to reference $.amount
in later steps.
Cause 2: Loss of JSON Object Structure in Subquery Aliasing
The subquery json_each(arr) a
in the original statement operated on the array [444,1]
(from Cause 1), which lacks key-value pairs. The value
column in json_each()
for such an array contains scalar values (e.g., 444
, 1
), not JSON objects. Consequently, json_extract(a.value, '$.amount')
had no amount
key to extract, resulting in NULL
.
Cause 3: Absence of Correlation Between Subquery and Outer Table
The WHERE t_id = json_extract(arr, '$[1]')
clause attempted to match the table’s t_id
with the second element of the extracted array (1
or 2
). However, because the subquery was not explicitly correlated to the outer UPDATE
statement’s current row, this condition either matched incorrectly or not at all, leading to unintended NULL assignments.
Troubleshooting Steps, Solutions & Fixes: Correctly Mapping JSON Data to Table Updates
Step 1: Preserve JSON Object Structure During Initial Extraction
Instead of extracting multiple fields into an array, extract the entire JSON object for each array element. Modify the subquery to use json_extract(a.value, '$')
to retain key-value pairs:
SELECT json_extract(a.value, '$') AS arr
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a
This returns JSON objects like {"t_id":1,"amount":444}
in the arr
column, preserving key names.
Step 2: Use json_each()
on Full JSON Objects, Not Scalar Arrays
With the full JSON object preserved, apply json_each()
to the arr
column to iterate over its key-value pairs:
SELECT json_extract(a.value, '$.amount') AS amount
FROM (
SELECT json_extract(a.value, '$') AS arr
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a
) q, json_each(q.arr) AS a
WHERE json_extract(q.arr, '$.t_id') = t.t_id
Here, json_each(q.arr)
iterates over the keys (t_id
, amount
) of the JSON object. The WHERE
clause correlates the subquery with the outer table’s t_id
.
Step 3: Simplify the Update Logic by Directly Joining JSON Data
A more efficient approach avoids nested subqueries by directly joining the table with the JSON array:
UPDATE t
SET amount = (
SELECT json_extract(a.value, '$.amount')
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a
WHERE json_extract(a.value, '$.t_id') = t.t_id
)
WHERE f_id = json_extract(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.f_id'
);
Key Improvements:
- The subquery in
SET amount = (...)
directly correlatest.t_id
withjson_extract(a.value, '$.t_id')
. json_each()
iterates over the$.obj.arr
array, with eacha.value
being a JSON object ({"t_id":1,"amount":444}
).- The outer
WHERE
clause restricts updates to rows matching the JSON’sf_id
.
Step 4: Validate JSON Paths and Intermediate Results
Use SELECT
statements to test each component of the query:
-- Verify JSON array extraction
SELECT json_extract(a.value, '$') AS obj
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a;
-- Check field extraction from JSON objects
SELECT
json_extract(a.value, '$.t_id') AS t_id,
json_extract(a.value, '$.amount') AS amount
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a;
This ensures that t_id
and amount
are correctly parsed before integrating them into the UPDATE
statement.
Step 5: Handle Edge Cases and Performance Considerations
- Missing
t_id
in JSON: Rows with no matchingt_id
in the JSON array will setamount
toNULL
. UseCOALESCE(amount, t.amount)
to retain existing values if needed. - Duplicate
t_id
in JSON: The subquery returns the first matchingamount
. UseLIMIT 1
or aggregate functions if duplicates are possible. - Indexing: Ensure
t.t_id
is indexed for faster correlation in large datasets.
Final Solution: Optimized and Robust Update Statement
UPDATE t
SET amount = (
SELECT json_extract(a.value, '$.amount')
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a
WHERE json_extract(a.value, '$.t_id') = t.t_id
)
WHERE EXISTS (
SELECT 1
FROM json_each(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.obj.arr'
) AS a
WHERE json_extract(a.value, '$.t_id') = t.t_id
)
AND f_id = json_extract(
'{"f_id":1,"obj":{"arr":[{"t_id":1,"amount":444},{"t_id":2,"amount":443}]}}',
'$.f_id'
);
Explanation:
- The
EXISTS
clause ensures only rows with a matchingt_id
in the JSON array are updated. - Direct correlation in the
SET
subquery avoids unnecessary joins. - Filtering by
f_id
restricts updates to the correct foreign key scope.
By systematically addressing the structural missteps in JSON parsing and aligning the subqueries with SQLite’s function semantics, this approach guarantees accurate bulk updates from JSON data while avoiding NULL assignment pitfalls.