Unexpected Non-NULL Results from LIMIT OFFSET with Large Offsets in SQLite
Understanding Subquery Behavior with Excessive OFFSET Values
This issue occurs when a subquery using LIMIT 1 OFFSET N returns a non-NULL value even though N exceeds the number of available rows. The problem exhibits inconsistent behavior based on three critical factors:
- Sort Direction: queries with ORDER BY ... DESC return unexpected non-NULL values, while ORDER BY ... ASC returns NULL
- Distinct Value Count: adding more distinct values to the dataset resolves the issue
- Row Duplication: duplicate values in the dataset trigger the anomalous behavior
Technical Manifestation
Given a table with 4 rows containing two distinct dates (2021-01-02 and 2021-01-03), the following query returns 2021-01-03 instead of NULL when using OFFSET 400:
SELECT (
SELECT DISTINCT date(acqtime)
FROM series
ORDER BY date(acqtime) DESC
LIMIT 1 OFFSET 400
) AS subquery_result
FROM series;
Contradictory Behaviors
- Changing DESC to ASC returns NULL
- Adding a third distinct date (e.g., 2021-01-04) makes the subquery return NULL
- Using non-distinct values (e.g., all identical dates) triggers the anomaly
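The sketch below reproduces these contradictions end to end. It assumes only the schema implied above (a series table with a text acqtime column) and uses hypothetical timestamp values; the exact results depend on whether your SQLite build is affected.
-- Minimal reproduction sketch (hypothetical acqtime values)
CREATE TABLE series(acqtime TEXT);
INSERT INTO series VALUES
  ('2021-01-02 08:00:00'), ('2021-01-02 09:00:00'),
  ('2021-01-03 08:00:00'), ('2021-01-03 09:00:00');

-- DESC: affected builds return '2021-01-03' instead of NULL
SELECT (SELECT DISTINCT date(acqtime) FROM series
        ORDER BY date(acqtime) DESC LIMIT 1 OFFSET 400) FROM series;

-- ASC: returns NULL as expected
SELECT (SELECT DISTINCT date(acqtime) FROM series
        ORDER BY date(acqtime) ASC LIMIT 1 OFFSET 400) FROM series;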
Root Causes of Erroneous LIMIT-OFFSET Execution
1. Query Optimizer Short-Circuiting with Duplicate Values
SQLite’s query planner uses an optimization called "flattening" for subqueries with ORDER BY and LIMIT. When all rows carry identical values after DISTINCT, the optimizer incorrectly assumes:
- a single unique value exists
- OFFSET becomes irrelevant (since there is "only one row")
- the first available value can be returned regardless of the OFFSET magnitude
This optimization bypasses the normal row-counting logic that would detect oversized offsets.
Proof of Concept
CREATE TABLE t1(x);
INSERT INTO t1 VALUES(1),(1),(1); -- Three identical values
SELECT (SELECT DISTINCT x FROM t1 ORDER BY +x LIMIT 1 OFFSET 100) FROM t1;
-- Affected builds return three rows of 1 instead of three NULLs
2. Sort Direction Impact on Index Selection
The DESC sort order triggers different index-traversal logic than ASC. When duplicate values exist:
- ASC sorts may follow a B-tree index's natural ascending order
- DESC sorts may use an optimized backward scan
- backward scans interact unpredictably with the DISTINCT and LIMIT clauses
This explains why changing sort direction alters the outcome.
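One way to observe the different access paths on your own build is to compare query plans with EXPLAIN QUERY PLAN (the output format varies by SQLite version); a quick check:
-- Compare the plans chosen for the two sort directions
EXPLAIN QUERY PLAN
SELECT DISTINCT date(acqtime) FROM series
ORDER BY date(acqtime) ASC LIMIT 1 OFFSET 400;

EXPLAIN QUERY PLAN
SELECT DISTINCT date(acqtime) FROM series
ORDER BY date(acqtime) DESC LIMIT 1 OFFSET 400;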
3. DISTINCT + ORDER BY Execution Pipeline
SQLite processes DISTINCT before applying LIMIT/OFFSET. The sequence is:
- Compute all distinct values
- Sort them per ORDER BY
- Apply LIMIT and OFFSET
When duplicates exist:
- Step 1 reduces the dataset to fewer rows than the OFFSET
- The engine should return NULL but fails to because of the flawed row counting described above (the sketch after this list writes the pipeline out explicitly)
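A minimal sketch of the same pipeline, written out as an explicit CTE against the series table used above:
-- Step 1: distinct values; Step 2: sort; Step 3: limit/offset
WITH distinct_dates AS (
  SELECT DISTINCT date(acqtime) AS d FROM series
)
SELECT d
FROM distinct_dates
ORDER BY d DESC
LIMIT 1 OFFSET 400;  -- offset past the end: no row (NULL in a scalar subquery context)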
Resolving LIMIT-OFFSET Anomalies and Workarounds
Step 1: Verify SQLite Version and Apply Patches
The bug was fixed in SQLite 3.37.2 (2022-01-12) via check-in 9282bcde301cee2a. Confirm your version:
SELECT sqlite_version();
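Beyond the version string, sqlite_source_id() reports the check-in hash and date of the source tree the library was built from, which can help confirm whether a particular fix is present in a custom build:
SELECT sqlite_version()   AS version,
       sqlite_source_id() AS source_checkin;  -- includes the build's check-in date and hash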
If Unpatched:
- Upgrade to 3.37.2+
- For embedded systems: Recompile with the latest amalgamation
Step 2: Rewrite Queries to Bypass the Optimizer
Approach A: Prevent DISTINCT-ORDER BY Optimization
Add a no-op expression to disable short-circuiting:
SELECT (
SELECT DISTINCT date(acqtime) || ''
FROM series
ORDER BY date(acqtime) DESC
LIMIT 1 OFFSET 400
) FROM series;
The || '' concatenation is a no-op that forces re-evaluation of distinctness.
Approach B: Use a Wrapper Subquery
SELECT (
  SELECT val FROM (
    SELECT DISTINCT date(acqtime) AS val
    FROM series
  )
  ORDER BY val DESC
  LIMIT 1 OFFSET 400
) FROM series;
Nesting the DISTINCT in its own subquery disables the premature optimization. Note that ORDER BY belongs on the outer query together with LIMIT/OFFSET: an ORDER BY inside a derived table does not guarantee the ordering of the enclosing query.
Step 3: Implement Edge-Case Handling
Solution 1: Pre-Calculate Row Counts
Use a CTE to verify offset validity:
WITH dates AS (
  SELECT DISTINCT date(acqtime) AS d
  FROM series
)
SELECT CASE
  WHEN (SELECT COUNT(*) FROM dates) > 400
  THEN (SELECT d FROM dates ORDER BY d DESC LIMIT 1 OFFSET 400)
  ELSE NULL
END AS subquery_result;
Solution 2: Explicit Count Check
COALESCE cannot suppress a spurious non-NULL result, so guard the subquery with an explicit distinct-count check instead:
SELECT CASE
  WHEN 400 >= (SELECT COUNT(DISTINCT date(acqtime)) FROM series) THEN NULL
  ELSE (SELECT ... LIMIT 1 OFFSET 400)
END;
Step 4: Indexing Strategies to Avoid Duplicate Issues
Create a covering index to influence the optimizer’s behavior:
CREATE INDEX idx_series_acqtime_date
ON series(date(acqtime) DESC, acqtime);
This can help the engine to:
- read precomputed date(acqtime) values from the index instead of recomputing them per row
- resolve DISTINCT against already-ordered values
- apply LIMIT/OFFSET against the intended ordering
Note that the index improves plan stability and performance; it is not a substitute for upgrading past the bug.
Step 5: Permanent Schema Design Adjustments
Normalize Date Values
Create a separate dates table to eliminate duplicates:
CREATE TABLE acquisition_dates (
date_id INTEGER PRIMARY KEY,
acq_date TEXT UNIQUE
);
INSERT INTO acquisition_dates (acq_date)
SELECT DISTINCT date(acqtime) FROM series;
ALTER TABLE series ADD COLUMN date_id REFERENCES acquisition_dates(date_id);

-- Backfill the new column from the existing timestamps
UPDATE series
SET date_id = (SELECT date_id
               FROM acquisition_dates
               WHERE acq_date = date(series.acqtime));
Query using the normalized table:
SELECT (
SELECT acq_date
FROM acquisition_dates
ORDER BY acq_date DESC
LIMIT 1 OFFSET 400
) FROM series;
Step 6: Regression Testing for LIMIT-OFFSET Logic
Implement a simple regression test as a plain SQL script run through the sqlite3 shell:
-- File: limit_offset.test
CREATE TABLE t1(x);
INSERT INTO t1 VALUES(1),(1),(1);
SELECT CASE WHEN (
SELECT DISTINCT x FROM t1 ORDER BY x DESC LIMIT 1 OFFSET 100
) IS NULL THEN 'PASS' ELSE 'FAIL' END;
-- Expected output: PASS
Run tests via:
sqlite3 < limit_offset.test
Final Recommendation: Upgrade + Defensive Coding
- Mandatory Upgrade: apply SQLite 3.37.2+
- Query Sanitization: use COALESCE/CASE to handle NULL expectations
- Index Optimization: precompute frequently used expressions like date(acqtime)
- Monitoring: log queries with large OFFSET values for audit
This comprehensive approach addresses both the immediate bug and underlying design flaws that expose systems to similar edge cases.