Database File Reverts to 20KB After Application Restart: Transaction Commit Issues
Transaction Commit Failures in High-Volume Data Insertion Workflows
Issue Overview
The core problem involves an SQLite database file reverting to a 20KB size after an application restart, despite apparent successful insertion of millions of rows during runtime. The application uses .NET 7 with the System.Data.SQLite
library, transactions, and prepared statements to handle bulk operations (e.g., 238 million rows). After the application terminates and restarts, the database file loses all inserted data, shrinking to its initial size. Key observations include:
- The database grows to 6–7GB during runtime but collapses to 20KB upon restart.
- Transactions are explicitly managed (
BeginTransaction
,Commit
,Dispose
). - Code paths involve FTP data ingestion, batch updates, and connection pooling.
- The issue persists across storage devices (NVMe, SSD).
The behavior suggests that data is not being permanently written to the database file. SQLite relies on transaction commits to finalize changes. If transactions are not properly committed, all modifications are rolled back when the connection closes. The 20KB file represents an empty SQLite schema (tables defined but no data).
Root Causes of Uncommitted Transactions
1. Implicit Rollback Due to Missing Transaction Commit
SQLite requires explicit commits to persist changes. If a transaction is opened but not committed before the connection closes, all changes are discarded. In the provided code:
- A transaction (
trans = updcmd.Connection.BeginTransaction()
) is initiated inside a loop processing FTP data. - The transaction is committed only after processing a batch of lines.
- Critical flaw: Exceptions or early exits (e.g.,
break
,return
) beforetrans.Commit()
leave the transaction uncommitted. For example, if an error occurs duringreadStream.ReadBlockAsync
, thecatch
block attempts to commit the transaction (incorrectly), but the transaction may be in an invalid state.
2. Connection Pooling Masking Transaction State
The connection string enables pooling (Pooling=True;Max Pool Size=100
). When a connection is returned to the pool, SQLite resets its state, including rolling back uncommitted transactions. However, if the application does not explicitly close connections, pooled connections may retain incomplete transactions, leading to inconsistent behavior across runs.
3. Asynchronous Code Antipatterns
The use of ReadBlockAsync(...).Result
(synchronous blocking on async code) risks deadlocks or thread-pool starvation, preventing the transaction from reaching the commit phase. This is exacerbated in high-throughput scenarios where thread-pool threads are scarce.
4. Exception Handling That Commits Incomplete Transactions
The catch
block in the FTP code erroneously calls trans.Commit()
during exceptions, attempting to persist partial data. This violates atomicity: transactions should either fully commit or roll back. SQLite may ignore commits on invalid transactions, leading to implicit rollbacks.
Resolving Uncommitted Transactions and Ensuring Data Persistence
Step 1: Validate Transaction Commit Workflow
Modify the transaction handling to guarantee commits:
// Before: Risk of uncommitted transactions on exceptions
trans = updcmd.Connection.BeginTransaction();
try
{
// Process data
trans.Commit();
}
catch
{
trans.Commit(); // Incorrect!
}
Corrected Code:
using (var trans = updcmd.Connection.BeginTransaction())
{
try
{
// Process data
trans.Commit();
}
catch
{
trans.Rollback(); // Enforce rollback on failure
throw;
}
}
The using
block ensures the transaction is disposed (and rolled back if not committed), even if exceptions occur.
Step 2: Disable Connection Pooling During Debugging
Temporarily disable pooling to isolate transaction issues:
private const string ConnectionString = "DataSource=c:\\tmp\\sl.ql;Version=3;Pooling=False;";
This forces physical connection closes, eliminating pooled connection reuse.
Step 3: Audit Exception Handling for Premature Commits
Replace all trans.Commit()
calls in catch
blocks with trans.Rollback()
:
catch (Exception ex)
{
if (trans != null)
{
trans.Rollback(); // Correct action
trans.Dispose();
}
// Log error
}
Step 4: Verify Asynchronous Code Execution
Replace synchronous blocking on async methods with proper async/await patterns:
// Before: Risk of deadlocks
int read = readStream.ReadBlockAsync(chars, writeFrom, writeLength).Result;
// After: Async-compatible flow
int read = await readStream.ReadBlockAsync(chars, writeFrom, writeLength);
Ensure the entire method chain supports async to prevent thread starvation.
Step 5: Enable Write-Ahead Loging (WAL) for Concurrent Writes
SQLite’s default rollback journal locks the database during writes, which can cause contention in high-throughput scenarios. Enable WAL mode:
using (var cmd = new SQLiteCommand("PRAGMA journal_mode=WAL;", con))
{
cmd.ExecuteNonQuery();
}
WAL allows concurrent reads and writes, reducing the chance of transaction timeouts.
Step 6: Profile Database File Integrity
After application shutdown, inspect the database:
- Use
sqlite3 sl.ql "PRAGMA integrity_check;"
to detect corruption. - Check file size: A 20KB file indicates no user data (only schema).
- Query row counts:
SELECT COUNT(*) FROM Device;
Step 7: Implement Idle Connection Validation
Configure the connection string to validate pooled connections:
DataSource=c:\tmp\sl.ql;Version=3;Pooling=True;Max Pool Size=100;
Connection Timeout=30;
This ensures stale connections with pending transactions are reset.
Step 8: Log Transaction Lifecycle Events
Instrument the code to log transaction start, commit, and rollback events:
trans = con.BeginTransaction();
Logger.Log("Transaction started");
// ...
trans.Commit();
Logger.Log("Transaction committed");
Cross-reference logs with database file states to identify missing commits.
Final Validation
- Run the application with logging enabled.
- After shutdown, inspect the database file size and contents.
- Restart the application: The database should retain its size and data.
By rigorously enforcing transactional boundaries, correcting async antipatterns, and validating connection states, the database will persist data across application restarts.