UPDATE Using CTE Incorrectly Overwrites All Rows Instead of Shifting Values

The Relationship Between SELECT and UPDATE Behavior in SQLite When Shifting Column Values

Issue Overview: SELECT Query Correctly Shifts Values While UPDATE Overwrites All Rows

The core problem involves shifting values in a code column based on adjacent rows using Common Table Expressions (CTEs). A SELECT query successfully displays shifted values (where each row’s code should adopt the value from the prior indexInter). However, when converting this logic to an UPDATE, all affected rows receive the same value instead of the intended cascaded values.

Key Observations:

  1. Data Structure: The testUpdate table contains non-consecutive indexInter values with NULL gaps and empty strings in code.

  2. SELECT Logic:

    WITH fixes AS (SELECT indexInter, code FROM testUpdate WHERE indexInter > 2)
    SELECT 
      indexRow, 
      indexInter, 
      code, 
      (SELECT code FROM fixes WHERE fixes.indexInter + 1 = testUpdate.indexInter) AS edit 
    FROM testUpdate;
    

    This correctly displays edit as the code from the previous indexInter (e.g., indexInter=4 gets code from indexInter=3).

  3. UPDATE Logic:

    WITH fixes AS (SELECT indexInter, code FROM testUpdate WHERE indexInter > 2)
    UPDATE testUpdate
    SET code = (SELECT code FROM fixes WHERE fixes.indexInter + 1 = testUpdate.indexInter)
    WHERE indexInter > 3;
    

    This erroneously sets all updated code values to H1732 (the first matching value) instead of cascading them.

  4. Successful Alternative:

    WITH fixes AS (SELECT indexInter, code FROM testUpdate WHERE indexInter > 2)
    UPDATE testUpdate
    SET code = fixes.code
    FROM fixes
    WHERE fixes.indexInter + 1 = testUpdate.indexInter
      AND testUpdate.indexInter > 3;
    

    This works because it uses an explicit join (UPDATE ... FROM) rather than a correlated subquery.

Critical Insight: The discrepancy arises from how SQLite handles CTE materialization and query execution order during UPDATE operations. The non-materialized CTE in the failing query causes the subquery to reference live table data (including newly updated rows) during the UPDATE, creating a cascading overwrite effect. The working query materializes the CTE first, creating a static snapshot of the data before updates begin.


Possible Causes: CTE Materialization and Live Data Reference During Updates

1. CTE Materialization Behavior

SQLite’s WITH clause defaults to NOT MATERIALIZED for CTEs used once. This means the CTE is inlined as a subquery rather than cached. In the failing UPDATE, the non-materialized CTE becomes:

UPDATE testUpdate
SET code = (
  SELECT code 
  FROM (SELECT indexInter, code FROM testUpdate WHERE indexInter > 2) 
  WHERE indexInter + 1 = testUpdate.indexInter
)
WHERE indexInter > 3;

Problem: The subquery re-executes for each row during the UPDATE, referencing the current state of testUpdate, which includes previously updated rows. After the first update (indexInter=4 set to H1732), subsequent rows see this modified value, leading to a chain reaction.

2. Correlated Subquery Execution Order

In the failing query, the subquery is correlated—it depends on testUpdate.indexInter from the outer query. SQLite processes updates row-by-row, and each subquery execution uses the latest testUpdate data. This creates a ripple effect:

  • Row indexInter=4 updates to H1732 (from indexInter=3).
  • Row indexInter=5 now looks for indexInter=4, which was just set to H1732, so it updates to H1732.
  • This repeats for all subsequent rows.

3. Absence of Isolation Between Subquery and Main Update

Unlike transactional isolation between connections, a single UPDATE statement sees its own changes immediately. The subquery in the SET clause references the live table, not a snapshot. This breaks the intended shift operation, as each update contaminates the data source for subsequent rows.

4. Materialized CTE vs. UPDATE-FROM Join

The working query avoids this pitfall by:

  • Materializing the CTE (explicitly or implicitly via UPDATE ... FROM), creating a static dataset before updates.
  • Joining the CTE to testUpdate in a way that isolates the source data from ongoing updates.

Resolving the Issue: Ensuring Static Data References During Updates

Step 1: Force CTE Materialization

Explicitly materialize the CTE to create a snapshot of the data before updates:

WITH fixes AS MATERIALIZED (
  SELECT indexInter, code 
  FROM testUpdate 
  WHERE indexInter > 2
)
UPDATE testUpdate
SET code = (SELECT code FROM fixes WHERE fixes.indexInter + 1 = testUpdate.indexInter)
WHERE indexInter > 3;

Why This Works: The MATERIALIZED hint forces SQLite to evaluate and cache the CTE results before the UPDATE begins. The subquery now references a static dataset, preventing contamination from updated rows.

Step 2: Use UPDATE-FROM Syntax for Implicit Materialization

Rewrite the query to join the CTE with the target table:

WITH fixes AS (
  SELECT indexInter, code 
  FROM testUpdate 
  WHERE indexInter > 2
)
UPDATE testUpdate
SET code = fixes.code
FROM fixes
WHERE fixes.indexInter + 1 = testUpdate.indexInter
  AND testUpdate.indexInter > 3;

Advantages:

  • The FROM clause implicitly materializes the CTE.
  • The join condition isolates the source data from updates, as the CTE is evaluated once at the start.

Step 3: Avoid Correlated Subqueries on the Updated Table

If materialization isn’t feasible, restructure the query to avoid subqueries that reference the updated table. Use a self-join instead:

UPDATE testUpdate AS target
SET code = source.code
FROM testUpdate AS source
WHERE source.indexInter + 1 = target.indexInter
  AND target.indexInter > 3
  AND source.indexInter > 2;

Note: This works only if indexInter is unique or properly indexed to prevent ambiguous joins.

Step 4: Analyze Query Plans to Verify Materialization

Use EXPLAIN QUERY PLAN to check if the CTE is materialized:

EXPLAIN QUERY PLAN
WITH fixes AS MATERIALIZED (...)
UPDATE ...;

Look for MATERIALIZE fixes in the plan. If absent, the CTE is inlined, risking live data reference.

Step 5: Use Transactions for Data Safety

Wrap the update in a transaction and verify changes before committing:

BEGIN TRANSACTION;
-- Perform update
SELECT * FROM testUpdate WHERE indexInter > 3; -- Verify
COMMIT; -- or ROLLBACK;

Step 6: Indexing for Performance

Add an index on indexInter to speed up both the CTE and join operations:

CREATE INDEX idx_testUpdate_indexInter ON testUpdate(indexInter);

Step 7: Update Order Control

Force SQLite to process rows in descending indexInter order to avoid contamination:

WITH fixes AS MATERIALIZED (...)
UPDATE testUpdate
SET code = (...)
WHERE indexInter > 3
ORDER BY indexInter DESC;

Rationale: Updating higher indexInter values first prevents subqueries from referencing already-updated rows.


Summary of Fixes and Best Practices

  1. Materialize CTEs when they reference data modified in the same statement.
  2. Prefer UPDATE-FROM Joins over correlated subqueries for multi-row updates.
  3. Avoid Live Data References in subqueries during DML operations.
  4. Use Indexes to optimize CTE and join performance.
  5. Verify Query Plans to ensure intended materialization and execution order.
  6. Leverage Transactions for atomic testing of updates.

By understanding SQLite’s CTE materialization rules and update execution mechanics, developers can avoid unintended data corruption during shift operations or cascading updates.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *