FTS5 Rows Disappear When ORDER BY rowid DESC Applied: Analysis and Fixes
Issue Overview: FTS5 Rows Missing with ORDER BY rowid DESC
The core issue revolves around the unexpected behavior of SQLite’s FTS5 (Full-Text Search) virtual table when executing queries involving ORDER BY rowid DESC
. Specifically, rows that are successfully returned with ORDER BY rowid ASC
or without any ordering clause disappear when the same query is modified to use ORDER BY rowid DESC
. This issue was observed in a large FTS5 table where rowids are constructed as 64-bit integers combining a Unix timestamp and a hash value, ensuring uniqueness and temporal ordering.
The problem manifests in queries like:
SELECT rowid FROM message_fts WHERE from_ MATCH 'search_term' AND rowid < 0x7fffffffffffffff ORDER BY rowid DESC;
While the equivalent query without ORDER BY
or with ORDER BY rowid ASC
works as expected, the DESC
ordering causes the query to return no results. This behavior is particularly puzzling because FTS5 is designed to handle rowid-based ordering efficiently, and the rowids in question are well within the valid range for SQLite’s 64-bit integers.
The issue was initially reported in SQLite version 3.40.1 and was later found to be resolved in version 3.42.0. However, the root cause appears to be related to an optimization bug in the query planner, which was inadvertently fixed by changes introduced for the secure-delete feature. This suggests that the problem might be deeper than a simple ordering issue and could involve how FTS5 interacts with SQLite’s internal query optimization mechanisms.
Possible Causes: Optimization Bugs and Rowid Handling in FTS5
The disappearance of rows when applying ORDER BY rowid DESC
in FTS5 queries can be attributed to several potential causes, each rooted in the interplay between FTS5’s internal mechanisms and SQLite’s query optimization strategies.
1. Query Planner Optimization Bug:
The most likely cause is a bug in SQLite’s query planner when optimizing FTS5 queries with descending rowid ordering. The query planner is responsible for determining the most efficient way to execute a query, including choosing indexes, join strategies, and sorting mechanisms. In this case, the planner might incorrectly optimize the query when ORDER BY rowid DESC
is applied, leading to an execution plan that skips valid rows or fails to traverse the index correctly.
This hypothesis is supported by the observation that wrapping the SELECT rowid
query in an aggregate function like COUNT(rowid)
or GROUP_CONCAT(rowid)
"fixes" the issue. Aggregate functions often force the query planner to adopt a different execution strategy, bypassing the problematic optimization path.
2. Rowid Encoding and Range Handling:
The rowids in the affected FTS5 table are constructed using a custom encoding scheme:
((unix timestamp) << 31) | ((hash) & 0x7fffffff)
This encoding ensures that rowids are unique and sortable by time. However, the use of bitwise operations and large 64-bit integers might expose edge cases in SQLite’s handling of rowids, particularly when dealing with descending order. SQLite’s internal mechanisms for sorting and indexing might not handle such custom-encoded rowids correctly, especially when combined with FTS5’s unique indexing strategy.
3. FTS5 Index Traversal Issues:
FTS5 maintains its own internal index structures to support full-text search operations. When a query includes an ORDER BY
clause, SQLite must traverse these indexes in the specified order. The issue might arise from a bug in how FTS5’s index traversal handles descending order, causing it to skip or miss rows that should be included in the result set.
4. Version-Specific Bugs:
The problem was observed in SQLite version 3.40.1 and resolved in version 3.42.0. This suggests that the issue might be related to version-specific bugs or regressions. The fix coinciding with the introduction of the secure-delete feature further complicates the matter, as it implies that changes unrelated to FTS5 or rowid handling inadvertently addressed the problem.
Troubleshooting Steps, Solutions & Fixes: Diagnosing and Resolving the Issue
To diagnose and resolve the issue of disappearing rows in FTS5 queries with ORDER BY rowid DESC
, follow these detailed steps:
1. Verify SQLite Version:
The first step is to ensure that you are using a version of SQLite that includes the fix for this issue. As reported, the problem was resolved in SQLite 3.42.0. If you are using an older version, upgrading to 3.42.0 or later should resolve the issue. You can check your SQLite version by running:
sqlite3 --version
If an upgrade is not immediately feasible, consider testing your queries on a system with the updated version to confirm that the issue is resolved.
2. Analyze Query Execution Plans:
To understand how the query planner is handling the ORDER BY rowid DESC
clause, use the EXPLAIN QUERY PLAN
statement. This will provide insights into the execution plan chosen by SQLite and help identify any anomalies. For example:
EXPLAIN QUERY PLAN
SELECT rowid FROM message_fts WHERE from_ MATCH 'search_term' AND rowid < 0x7fffffffffffffff ORDER BY rowid DESC;
Compare the output with the execution plan for the working query (without ORDER BY
or with ORDER BY rowid ASC
). Look for differences in index usage, sorting strategies, or traversal methods that might explain the missing rows.
3. Test with Aggregate Functions:
As noted in the discussion, wrapping the SELECT rowid
query in an aggregate function like COUNT(rowid)
or GROUP_CONCAT(rowid)
can bypass the problematic optimization. This workaround can be useful for diagnosing the issue and providing a temporary fix. For example:
SELECT COUNT(rowid) FROM message_fts WHERE from_ MATCH 'search_term' AND rowid < 0x7fffffffffffffff ORDER BY rowid DESC;
If this query returns the expected results, it further supports the hypothesis that the issue lies in the query planner’s optimization for non-aggregate queries.
4. Simplify the Query:
To isolate the issue, simplify the query as much as possible. Remove unnecessary conditions and clauses to determine the minimal set of operations that trigger the problem. For example:
SELECT rowid FROM message_fts ORDER BY rowid DESC;
If this simplified query also fails to return results, the issue is likely related to FTS5’s handling of descending rowid ordering rather than the specific conditions in the original query.
5. Check for Custom Rowid Encoding Issues:
Given the custom encoding scheme used for rowids, test whether the issue persists with simpler rowid values. Create a new FTS5 table with standard integer rowids and run similar queries to see if the problem occurs. This will help determine whether the custom encoding is a contributing factor.
6. Review FTS5 Configuration:
Ensure that the FTS5 table is configured correctly and that no unusual settings or options are affecting its behavior. For example, check the tokenizer, contentless table configuration, and any custom options that might influence indexing or query execution.
7. Provide a Minimal Reproduction:
If the issue persists and you need further assistance, create a minimal reproduction of the problem. This involves creating a small, self-contained database that exhibits the issue and sharing it with the SQLite development team or the community. A minimal reproduction should include:
- The schema of the FTS5 table.
- A small set of sample data.
- The exact queries that trigger the issue.
8. Monitor for Future Updates:
Even if the issue is resolved in SQLite 3.42.0, continue to monitor for updates and patches. SQLite is actively maintained, and new versions may include additional fixes or improvements related to FTS5 and rowid handling.
9. Consider Alternative Approaches:
If upgrading SQLite or applying the above fixes is not feasible, consider alternative approaches to achieve the desired query results. For example:
- Use a combination of
ORDER BY rowid ASC
and application-level logic to reverse the order of results. - Implement custom indexing or sorting mechanisms outside of FTS5.
- Explore other full-text search solutions that might better suit your requirements.
By following these steps, you can diagnose and resolve the issue of disappearing rows in FTS5 queries with ORDER BY rowid DESC
. The key is to systematically isolate the problem, test potential fixes, and leverage the SQLite community’s expertise when needed.