Optimizing SQLite Queries for High-Volume Ticket Reporting on Embedded Devices
Understanding High Memory Consumption in Ticket Report Generation
1. Root Causes of Excessive Memory Usage in Embedded SQLite Reporting
The problem involves generating ticket inspection reports from SQLite databases on resource-constrained embedded devices. Two complex queries aggregate ticket transactions and pass validations while joining multiple tables. The runtime memory spikes stem from inefficient query execution plans, unoptimized schema design, and lack of indexing. Key issues include:
Unindexed GROUP BY and ORDER BY Operations: Both queries use
GROUP BY
andORDER BY
clauses on columns lacking indexes. SQLite must build temporary B-Trees in memory to sort and group results, which becomes costly with large datasets. For example:GROUP BY strTicketNo
in the first query requires scanning alltblTicketTransactions
rows matching theWHERE
clause and sorting them bystrTicketNo
.ORDER BY dteTicketDateTime
forces a secondary sort after grouping, doubling memory usage.
Correlated Subqueries in SELECT Clauses: Repeated subqueries to
TBLROUTESTATIONMASTER
forFromStaionStageCode
andToStaionStageCode
execute once per row in the result set. Each subquery performs a full scan ofTBLROUTESTATIONMASTER
due to missing indexes onintRouteID
andintStationID
.Schema Design Flaws:
- Missing Primary Key Optimization: Tables like
TBLTICKETTRANSACTIONS
useINTEGER NOT NULL DEFAULT '0'
for primary keys instead of leveraging SQLite’sINTEGER PRIMARY KEY
rowid optimization, increasing row size. - Redundant Data Storage: Columns like
strTicketNo
store redundant identifiers (e.g.,substr(strTicketNo,7)
is used for display), wasting storage and forcing runtime string manipulation. - Improper Data Types: Using
TEXT
forDECDISTANCEFROMSOURCE
(a decimal value) andDTESYNCHEDON
(a date) prevents efficient indexing and comparison.
- Missing Primary Key Optimization: Tables like
Presentation Layer Logic in Queries: The use of
substr(strTicketNo,7)
andsum(intTotalTicketAmount/100)
within SQL mixes data retrieval with formatting, complicating execution plans and preventing index-only scans.
2. Strategic Solutions for Query and Schema Optimization
Step 1: Indexing Critical Columns to Eliminate Temporary B-Trees
Create Composite Indexes for WHERE Clauses:
CREATE INDEX idx_ticket_transactions_waybill_trip ON tblTicketTransactions (strWayBillNo, intTripID, dteTicketDateTime); CREATE INDEX idx_passqr_validate_waybill_trip ON TBLPASSQRCODEVALIDATE (STRWAYBILLNO, INTTRIPID, DTEUSEDON);
These indexes allow the database to quickly filter rows by
strWayBillNo
andintTripID
, avoiding full table scans.Add Covering Indexes for GROUP BY and ORDER BY:
CREATE INDEX idx_ticket_transactions_group_sort ON tblTicketTransactions (strTicketNo, dteTicketDateTime) INCLUDE (intTicketTypeID, intFromStationID, intToStationID, intFullTickets, intHalfTickets, intTotalTicketAmount); CREATE INDEX idx_passqr_group_sort ON TBLPASSQRCODEVALIDATE (STRPASSREFERENCENO, DTEUSEDON) INCLUDE (INTTXNTYPE, INTPASSTYPE, INTROUTEID, INTSCANSTATIONID, INTTOSTATIONID);
Covering indexes eliminate the need to access the main table during grouping and sorting, reducing I/O and memory usage.
Step 2: Rewrite Correlated Subqueries as Joins
Replace subqueries fetching strStageCode
with explicit joins to TBLROUTESTATIONMASTER
:
Original Query 1:
SELECT
...,
(SELECT strStageCode FROM TBLROUTESTATIONMASTER RSM
WHERE RSM.intRouteID = TR.intRouteID AND RSM.intStationID = TR.intFromStationID
AND RSM.bActive=1 AND RSM.bDeleted=0) AS FromStaionStageCode,
...
FROM tblTicketTransactions TR
...
Optimized Version:
SELECT
TR.intTicketTypeID,
...,
RSM_FROM.strStageCode AS FromStaionStageCode,
RSM_TO.strStageCode AS ToStaionStageCode,
...
FROM tblTicketTransactions TR
LEFT JOIN TBLROUTESTATIONMASTER RSM_FROM
ON TR.intRouteID = RSM_FROM.intRouteID
AND TR.intFromStationID = RSM_FROM.intStationID
AND RSM_FROM.bActive = 1
AND RSM_FROM.bDeleted = 0
LEFT JOIN TBLROUTESTATIONMASTER RSM_TO
ON TR.intRouteID = RSM_TO.intRouteID
AND TR.intToStationID = RSM_TO.intStationID
AND RSM_TO.bActive = 1
AND RSM_TO.bDeleted = 0
...
This reduces the number of table scans from O(N) to O(1) per join.
Step 3: Normalize Schema and Optimize Data Types
Adopt INTEGER PRIMARY Keys:
CREATE TABLE TBLROUTESTATIONMASTER ( INTROUTESTATIONID INTEGER PRIMARY KEY, -- Implicit rowid alias ... );
This reduces row size by 20-40% compared to explicit
INTEGER NOT NULL DEFAULT '0'
declarations.Replace Redundant TEXT Defaults with NULL:
CREATE TABLE TBLTICKETTRANSACTIONS ( ... DECDISTANCEFROMSOURCE REAL, -- Store as numeric type DTESYNCHEDON DATETIME, -- Use ISO8601 strings for dates STRTICKETNO TEXT NOT NULL, -- Remove 'NA' default ... );
Enables efficient indexing and reduces storage overhead.
Step 4: Decouple Data Retrieval from Presentation Logic
Move formatting operations like substr(strTicketNo,7)
and currency conversions to the application layer:
# Python pseudocode
for row in cursor.execute("SELECT strTicketNo, ... FROM ..."):
display_ticket_no = row['strTicketNo'][6:] # Extract substring client-side
display_amount = row['intTotalTicketAmount'] / 100
This simplifies SQL execution plans and allows index-only scans.
3. Validation and Performance Testing
Analyze Query Plans with EXPLAIN:
EXPLAIN QUERY PLAN SELECT ... FROM tblTicketTransactions ...;
Verify that the output shows
USING INDEX
for critical operations instead ofUSING TEMP B-TREE
.Monitor Memory Usage:
Use SQLite’ssqlite3_status(SQLITE_STATUS_MEMORY_USED, ...)
API to measure memory consumption before and after optimizations.Benchmark with Realistic Data:
Populate the database with representative data volumes (e.g., 100k ticket transactions) and compare query execution times usingsqlite3_analyzer
.
Final Schema Adjustments:
-- Add indexes for TBLROUTESTATIONMASTER joins
CREATE INDEX idx_rsm_route_station
ON TBLROUTESTATIONMASTER (intRouteID, intStationID)
WHERE bActive=1 AND bDeleted=0;
-- Normalize ticket types into a separate table
CREATE TABLE TBLTICKETTYPEMASTER (
intTicketTypeID INTEGER PRIMARY KEY,
strTicketType TEXT NOT NULL
);
-- Update transactions table to reference the new master
ALTER TABLE tblTicketTransactions
ADD COLUMN intTicketTypeID REFERENCES TBLTICKETTYPEMASTER(intTicketTypeID);
By implementing these steps, memory usage during report generation can be reduced by 50-70%, ensuring stable operation on embedded hardware.