Unexpected MAX() Window Function Results Due to Range Preceding Edge Case
Unexpected MAX() Window Function Behavior with Specific Range Preceding Values
The core issue involves unexpected results from the MAX()
window function when applied to a dataset with a specific range-based window frame in SQLite. A user observed that a window defined with RANGE 365 PRECEDING
produced a lower value than a window defined with RANGE 364 PRECEDING
for the same row, despite the larger window logically including the same maximum value (1.0) present in the smaller window. This contradicts the expectation that expanding the window size should not reduce the computed maximum value unless the additional rows introduced by the larger window contain smaller values. In this case, all additional rows introduced by the larger window had a value of 0.0, which should not override the existing 1.0 maximum.
The dataset in question contains two columns:
c1
: A strictly monotonically increasing integer with gaps (e.g., 4, 5, 6, …, 544).c2
: A real number column containing 0.0 for all rows except one (c1=537
,c2=1.0
).
The query computes two windowed maximums over c2
:
MAX(c2) OVER (ORDER BY c1 RANGE 364 PRECEDING)
MAX(c2) OVER (ORDER BY c1 RANGE 365 PRECEDING)
For the row with c1=544
, the first window correctly returns 1.0, while the second returns 0.0. This anomaly occurs despite both windows including the row c1=537
where c2=1.0
. The root cause lies in how SQLite calculates the lower bound of the window frame for RANGE PRECEDING
clauses under specific edge conditions, leading to incorrect exclusion of the critical row containing c2=1.0
in the larger window.
Boundary Calculation Errors in RANGE-Based Window Frames
The unexpected behavior stems from a miscalculation in determining the lower bound of the window frame when using RANGE PRECEDING
with integer-ordered data. SQLite’s window function engine uses a binary search to identify the start of the window frame for RANGE
clauses. This search relies on the ORDER BY
column’s data type (integer in this case) and the range value to compute the lower bound. A bug in the boundary calculation caused the larger range (365 PRECEDING
) to exclude the row c1=537
when computing the window for c1=544
, even though 544 - 365 = 179
, and the row c1=537
(which is greater than 179) should logically be included.
Key technical details:
Range Preceding Logic:
For a row withc1=X
, the window frame includes all rows wherec1
is betweenX - N
andX
, whereN
is the range value. ForX=544
, the lower bound should be544 - 365 = 179
. All rows withc1 >= 179
andc1 <= 544
should be included. The rowc1=537
(which is within this range) containsc2=1.0
and should influence the maximum value.Binary Search Precision:
SQLite uses a binary search on theORDER BY
column (c1
) to find the first row greater than or equal to the lower bound. The bug caused the search to incorrectly identify the starting row, skippingc1=537
in the365 PRECEDING
window. This was due to an off-by-one error in the comparison logic when handling integer ranges.Data Type Handling:
Thec1
column is an integer, but the range calculation involves subtracting a value (365) from the current row’sc1
. If the subtraction operation is not precisely bounded, it can lead to incorrect comparisons. For example,544 - 365 = 179
, but due to integer division or truncation in intermediate steps, the actual lower bound might have been computed as180
, excludingc1=179
(if present) and shifting the window frame.Assertion Failures in Debug Builds:
The bug triggered an assertion failure in debug builds of SQLite, indicating an internal inconsistency in the window frame boundaries. In release builds (with assertions disabled), the miscalculation persisted silently, returning incorrect results.
Resolving Boundary Miscalculations in RANGE Windows
Step 1: Verify SQLite Version and Apply the Fix
The issue was patched in SQLite’s source code repository (see commit e042eb024738a83e). Users experiencing this problem should:
- Check their SQLite version using
SELECT sqlite_version();
. - Upgrade to a version containing the fix (versions released after August 29, 2024).
Step 2: Workarounds for Unpatched Versions
If upgrading is not immediately feasible, apply these workarounds:
Use ROWS Instead of RANGE:
ReplaceRANGE PRECEDING
withROWS PRECEDING
if the ordering column (c1
) has no duplicates and the goal is to include a fixed number of rows. For example:MAX(c2) OVER (ORDER BY c1 ROWS 365 PRECEDING)
This ensures the window includes exactly 365 preceding rows, avoiding boundary calculation issues.
Adjust the Range Value:
Increase the range slightly to ensure the critical row is included. For example, usingRANGE 366 PRECEDING
instead of365
might bypass the miscalculation.Materialize the Window Frame:
Explicitly define the window frame using a subquery to compute the lower and upper bounds:SELECT c1, c2, (SELECT MAX(c2) FROM weird WHERE c1 BETWEEN current.c1 - 364 AND current.c1) AS max_364, (SELECT MAX(c2) FROM weird WHERE c1 BETWEEN current.c1 - 365 AND current.c1) AS max_365 FROM weird AS current;
Step 3: Validate Window Frame Boundaries
To debug window frames, manually compute the expected bounds and verify which rows are included:
-- For the row with c1=544:
SELECT
MIN(c1) AS lower_bound,
MAX(c1) AS upper_bound
FROM weird
WHERE c1 BETWEEN 544 - 365 AND 544;
If the result does not include c1=537
, this indicates a deeper issue with data import or schema definitions.
Step 4: Audit Data Types and Indexes
Ensure the ORDER BY
column (c1
) is indexed and explicitly typed as INTEGER
. SQLite may treat untyped columns as BLOB
in some import scenarios, leading to unexpected comparisons. Recreate the table with strict typing:
CREATE TABLE weird (c1 INTEGER PRIMARY KEY, c2 REAL) STRICT;
Step 5: Test Edge Cases with Minimal Datasets
Reproduce the issue with a custom dataset to isolate boundary conditions:
INSERT INTO weird (c1, c2) VALUES
(179, 0.0),
(537, 1.0),
(544, 0.0);
Run the window function query to see if the maximum value is correctly computed for c1=544
.
Step 6: Monitor Debug Build Assertions
Compile SQLite with debugging enabled (-DSQLITE_DEBUG
) to catch assertion failures during query execution. This helps identify unresolved boundary calculation issues.
By addressing the boundary calculation logic, ensuring proper data typing, and leveraging workarounds, users can mitigate this issue until a patched SQLite version is deployed. The fix underscores the importance of rigorous testing for window function edge cases, particularly with integer-ordered data and range-based frames.