Optimizing SQLite Queries for High-Volume Ticket Reporting on Embedded Devices
Understanding High Memory Consumption in Ticket Report Generation
1. Root Causes of Excessive Memory Usage in Embedded SQLite Reporting
The problem involves generating ticket inspection reports from SQLite databases on resource-constrained embedded devices. Two complex queries aggregate ticket transactions and pass validations while joining multiple tables. The runtime memory spikes stem from inefficient query execution plans, unoptimized schema design, and lack of indexing. Key issues include:
-
Unindexed GROUP BY and ORDER BY Operations: Both queries use
GROUP BYandORDER BYclauses on columns lacking indexes. SQLite must build temporary B-Trees in memory to sort and group results, which becomes costly with large datasets. For example:GROUP BY strTicketNoin the first query requires scanning alltblTicketTransactionsrows matching theWHEREclause and sorting them bystrTicketNo.ORDER BY dteTicketDateTimeforces a secondary sort after grouping, doubling memory usage.
-
Correlated Subqueries in SELECT Clauses: Repeated subqueries to
TBLROUTESTATIONMASTERforFromStaionStageCodeandToStaionStageCodeexecute once per row in the result set. Each subquery performs a full scan ofTBLROUTESTATIONMASTERdue to missing indexes onintRouteIDandintStationID. -
Schema Design Flaws:
- Missing Primary Key Optimization: Tables like
TBLTICKETTRANSACTIONSuseINTEGER NOT NULL DEFAULT '0'for primary keys instead of leveraging SQLite’sINTEGER PRIMARY KEYrowid optimization, increasing row size. - Redundant Data Storage: Columns like
strTicketNostore redundant identifiers (e.g.,substr(strTicketNo,7)is used for display), wasting storage and forcing runtime string manipulation. - Improper Data Types: Using
TEXTforDECDISTANCEFROMSOURCE(a decimal value) andDTESYNCHEDON(a date) prevents efficient indexing and comparison.
- Missing Primary Key Optimization: Tables like
-
Presentation Layer Logic in Queries: The use of
substr(strTicketNo,7)andsum(intTotalTicketAmount/100)within SQL mixes data retrieval with formatting, complicating execution plans and preventing index-only scans.
2. Strategic Solutions for Query and Schema Optimization
Step 1: Indexing Critical Columns to Eliminate Temporary B-Trees
-
Create Composite Indexes for WHERE Clauses:
CREATE INDEX idx_ticket_transactions_waybill_trip ON tblTicketTransactions (strWayBillNo, intTripID, dteTicketDateTime); CREATE INDEX idx_passqr_validate_waybill_trip ON TBLPASSQRCODEVALIDATE (STRWAYBILLNO, INTTRIPID, DTEUSEDON);These indexes allow the database to quickly filter rows by
strWayBillNoandintTripID, avoiding full table scans. -
Add Covering Indexes for GROUP BY and ORDER BY:
CREATE INDEX idx_ticket_transactions_group_sort ON tblTicketTransactions (strTicketNo, dteTicketDateTime) INCLUDE (intTicketTypeID, intFromStationID, intToStationID, intFullTickets, intHalfTickets, intTotalTicketAmount); CREATE INDEX idx_passqr_group_sort ON TBLPASSQRCODEVALIDATE (STRPASSREFERENCENO, DTEUSEDON) INCLUDE (INTTXNTYPE, INTPASSTYPE, INTROUTEID, INTSCANSTATIONID, INTTOSTATIONID);Covering indexes eliminate the need to access the main table during grouping and sorting, reducing I/O and memory usage.
Step 2: Rewrite Correlated Subqueries as Joins
Replace subqueries fetching strStageCode with explicit joins to TBLROUTESTATIONMASTER:
Original Query 1:
SELECT
...,
(SELECT strStageCode FROM TBLROUTESTATIONMASTER RSM
WHERE RSM.intRouteID = TR.intRouteID AND RSM.intStationID = TR.intFromStationID
AND RSM.bActive=1 AND RSM.bDeleted=0) AS FromStaionStageCode,
...
FROM tblTicketTransactions TR
...
Optimized Version:
SELECT
TR.intTicketTypeID,
...,
RSM_FROM.strStageCode AS FromStaionStageCode,
RSM_TO.strStageCode AS ToStaionStageCode,
...
FROM tblTicketTransactions TR
LEFT JOIN TBLROUTESTATIONMASTER RSM_FROM
ON TR.intRouteID = RSM_FROM.intRouteID
AND TR.intFromStationID = RSM_FROM.intStationID
AND RSM_FROM.bActive = 1
AND RSM_FROM.bDeleted = 0
LEFT JOIN TBLROUTESTATIONMASTER RSM_TO
ON TR.intRouteID = RSM_TO.intRouteID
AND TR.intToStationID = RSM_TO.intStationID
AND RSM_TO.bActive = 1
AND RSM_TO.bDeleted = 0
...
This reduces the number of table scans from O(N) to O(1) per join.
Step 3: Normalize Schema and Optimize Data Types
-
Adopt INTEGER PRIMARY Keys:
CREATE TABLE TBLROUTESTATIONMASTER ( INTROUTESTATIONID INTEGER PRIMARY KEY, -- Implicit rowid alias ... );This reduces row size by 20-40% compared to explicit
INTEGER NOT NULL DEFAULT '0'declarations. -
Replace Redundant TEXT Defaults with NULL:
CREATE TABLE TBLTICKETTRANSACTIONS ( ... DECDISTANCEFROMSOURCE REAL, -- Store as numeric type DTESYNCHEDON DATETIME, -- Use ISO8601 strings for dates STRTICKETNO TEXT NOT NULL, -- Remove 'NA' default ... );Enables efficient indexing and reduces storage overhead.
Step 4: Decouple Data Retrieval from Presentation Logic
Move formatting operations like substr(strTicketNo,7) and currency conversions to the application layer:
# Python pseudocode
for row in cursor.execute("SELECT strTicketNo, ... FROM ..."):
display_ticket_no = row['strTicketNo'][6:] # Extract substring client-side
display_amount = row['intTotalTicketAmount'] / 100
This simplifies SQL execution plans and allows index-only scans.
3. Validation and Performance Testing
-
Analyze Query Plans with EXPLAIN:
EXPLAIN QUERY PLAN SELECT ... FROM tblTicketTransactions ...;Verify that the output shows
USING INDEXfor critical operations instead ofUSING TEMP B-TREE. -
Monitor Memory Usage:
Use SQLite’ssqlite3_status(SQLITE_STATUS_MEMORY_USED, ...)API to measure memory consumption before and after optimizations. -
Benchmark with Realistic Data:
Populate the database with representative data volumes (e.g., 100k ticket transactions) and compare query execution times usingsqlite3_analyzer.
Final Schema Adjustments:
-- Add indexes for TBLROUTESTATIONMASTER joins
CREATE INDEX idx_rsm_route_station
ON TBLROUTESTATIONMASTER (intRouteID, intStationID)
WHERE bActive=1 AND bDeleted=0;
-- Normalize ticket types into a separate table
CREATE TABLE TBLTICKETTYPEMASTER (
intTicketTypeID INTEGER PRIMARY KEY,
strTicketType TEXT NOT NULL
);
-- Update transactions table to reference the new master
ALTER TABLE tblTicketTransactions
ADD COLUMN intTicketTypeID REFERENCES TBLTICKETTYPEMASTER(intTicketTypeID);
By implementing these steps, memory usage during report generation can be reduced by 50-70%, ensuring stable operation on embedded hardware.