Slow SQLite Query Performance Due to Missing Automatic Covering Index
SQLite Query Performance Degradation in Combined Query Execution
The core issue revolves around a significant performance degradation when executing a combined SQLite query compared to running the same logic as two separate queries. The combined query takes approximately 0.13 seconds to execute, whereas splitting the query into two parts reduces the execution time to around 0.007 seconds (0.005s + 0.002s). This discrepancy is primarily due to the absence of an automatic covering index in the combined query, which is otherwise utilized in the separated queries. The query involves calculating cumulative charging durations from a given percentage to 100%, using a recursive common table expression (CTE) and multiple joins.
The table schema is as follows:
CREATE TABLE ChargingRates (
Age INTEGER NOT NULL,
Type INTEGER NOT NULL,
FromPercent REAL NOT NULL,
ToPercent REAL NOT NULL,
Duration INTEGER NOT NULL,
Recorded TEXT DEFAULT (strftime('%Y-%m-%d %H:%M:%f', 'now', 'localtime'))
);
The query logic involves two levels of data aggregation: Level0, which calculates the average duration across all recorded data, and Level1, which calculates the duration for specific percentage ranges. The combined query attempts to perform these calculations in a single execution, but the lack of an automatic covering index leads to inefficient execution plans and slower performance.
Interrupted Automatic Covering Index Creation in Recursive CTE
The primary cause of the performance degradation is the failure of SQLite to create an automatic covering index for the combined query. In the separated queries, SQLite successfully creates an automatic covering index, which significantly speeds up the execution. However, in the combined query, this optimization is not applied, leading to full table scans and inefficient joins.
The query plan for the combined query reveals several inefficiencies:
- The
Percentages
CTE is scanned repeatedly. - The
ChargingRates
table is scanned multiple times, even after adding recommended indexes. - The
Durations
CTE is scanned without leveraging an index, leading to increased computational overhead.
The separated queries, on the other hand, benefit from the automatic covering index, as evidenced by the query plan:
- The
Percentages
CTE is scanned only once. - The
ChargingRates
table is searched using the created indexes (ChargingRates_idx_99a2146b
andChargingRates_idx_000119d2
). - The
Durations
CTE leverages the automatic covering index, reducing the computational load.
The absence of the automatic covering index in the combined query forces SQLite to perform redundant operations, such as repeated scans and inefficient joins, which are avoided in the separated queries.
Implementing Index Optimization and Query Refactoring for Performance Improvement
To address the performance degradation, the following steps can be taken:
Step 1: Analyze and Optimize Indexes
The first step is to ensure that the necessary indexes are in place. The recommended indexes from the .expert
command are:
CREATE INDEX ChargingRates_idx_000119d2 ON ChargingRates(Type);
CREATE INDEX ChargingRates_idx_99a2146b ON ChargingRates(Type, FromPercent, ToPercent);
These indexes help optimize the search operations on the ChargingRates
table. However, they do not fully address the issue in the combined query. To further optimize, consider creating a composite index that covers all the columns used in the joins and filters:
CREATE INDEX ChargingRates_idx_optimized ON ChargingRates(Type, FromPercent, ToPercent, Duration);
This index ensures that all the necessary columns are covered, reducing the need for additional table scans.
Step 2: Refactor the Combined Query
The combined query can be refactored to reduce redundancy and improve efficiency. One approach is to break down the query into smaller, more manageable parts and use temporary tables to store intermediate results. This approach mimics the separated queries but maintains the logical flow of the combined query.
Refactored Query:
-- Step 1: Calculate Level1 and Level0 durations
WITH Percentages(P) AS (
SELECT .31
UNION ALL
SELECT ROUND(P + .01, 2) FROM Percentages WHERE P < 1.01
),
Level1Durations AS (
SELECT
P AS FromP,
CASE WHEN ToPercent IS NOT NULL
THEN ToPercent
ELSE ROUND(P + .01, 2)
END AS ToP,
CASE WHEN ChargingRatesLevel1.Duration IS NOT NULL
THEN (ToPercent - P) / (ToPercent - FromPercent) * ChargingRatesLevel1.Duration
ELSE ChargingRatesLevel0.Duration
END AS Duration
FROM Percentages
LEFT JOIN (
SELECT 1 AS Level, FromPercent AS FromPercent, ToPercent AS ToPercent, AVG(Duration) AS Duration
FROM ChargingRates
WHERE Type = 1
GROUP BY FromPercent, ToPercent
) ChargingRatesLevel1 ON (ChargingRatesLevel1.FromPercent <= P AND ChargingRatesLevel1.ToPercent > P)
LEFT JOIN (
SELECT 0 AS Level, AVG(Duration) AS Duration
FROM ChargingRates
WHERE Type = 1
) ChargingRatesLevel0
WHERE FromP IS NULL OR ToPercent IS NULL OR FromP <> ToPercent
),
-- Step 2: Store intermediate results in a temporary table
TempDurations AS (
SELECT * FROM Level1Durations
),
-- Step 3: Calculate cumulative durations
CumulativeDurations AS (
SELECT .31 AS FromP, .31 AS ToP, 0 AS Seconds
UNION ALL
SELECT CumulativeDurations.FromP, TempDurations.ToP, TempDurations.Duration + CumulativeDurations.Seconds
FROM TempDurations
INNER JOIN CumulativeDurations ON (CumulativeDurations.ToP = TempDurations.FromP)
)
-- Step 4: Select the final results
SELECT * FROM CumulativeDurations WHERE ToP <= 1 ORDER BY ToP ASC;
This refactored query breaks down the logic into smaller, more efficient parts and uses temporary tables to store intermediate results. This approach reduces the computational overhead and allows SQLite to leverage indexes more effectively.
Step 3: Analyze Query Performance
After implementing the above changes, analyze the query performance using the EXPLAIN QUERY PLAN
command. Ensure that the query plan shows efficient index usage and minimal table scans. If necessary, further refine the indexes or query structure to optimize performance.
Step 4: Monitor and Adjust
Finally, monitor the query performance over time and adjust the indexes or query structure as needed. SQLite’s query optimizer may benefit from periodic updates to the statistics using the ANALYZE
command, which helps the optimizer make better decisions about index usage.
By following these steps, the performance of the combined query can be significantly improved, bringing it closer to the efficiency of the separated queries. The key is to ensure that the necessary indexes are in place and that the query structure allows SQLite to leverage these indexes effectively.