
Optimizing Sqlite3 for 20,000+ Updates


I think your bottleneck is that you commit with each insert/update:

I commit each transaction right after the update/insert.

Either stop doing that, or at least switch to WAL journaling; see this answer of mine for why: SQL Server CE 4.0 performance comparison
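As a minimal sketch in Python (the `items` table and its columns are placeholders, not from the question), wrapping the whole batch in a single transaction and enabling WAL mode might look like this:

    import sqlite3

    conn = sqlite3.connect("data.db")
    conn.execute("PRAGMA journal_mode=WAL")   # WAL persists in the file; cheaper syncs on commit

    rows = [("key%d" % i, i) for i in range(20000)]

    with conn:  # one commit for the whole batch instead of one commit per row
        conn.executemany("UPDATE items SET value = ? WHERE key = ?",
                         [(v, k) for k, v in rows])
    conn.close()

The point is the single `with conn:` block: 20,000 statements inside one transaction are dramatically cheaper than 20,000 separate commits.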

If you have a primary key, you can optimize out the SELECT by using the ON CONFLICT clause with INSERT INTO:

http://www.sqlite.org/lang_conflict.html
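As a sketch, assuming a hypothetical `items` table keyed on `key`, `INSERT OR REPLACE` (one of the conflict-resolution clauses documented at that link) collapses the SELECT-then-INSERT-or-UPDATE pattern into a single statement:

    import sqlite3

    conn = sqlite3.connect("data.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS items (
                        key   TEXT PRIMARY KEY,
                        value INTEGER
                    )""")

    with conn:
        # If the primary key already exists, the row is replaced instead of
        # raising a constraint error, so no separate SELECT is needed.
        conn.executemany("INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
                         [("a", 1), ("b", 2), ("a", 3)])  # the second "a" overwrites the first

    print(conn.execute("SELECT * FROM items ORDER BY key").fetchall())
    conn.close()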

EDIT: Earlier I wrote "foreign key" when I meant "primary key"; I've fixed it.


Edit: shame on me. I misread the question and somehow understood this was for MySQL rather than SQLite... Oops.
Please disregard this response, other than to get generic ideas about updating DBMSes. The likely solution to the OP's problem is the overly frequent commits, as pointed out in sixfeetsix's response.


A plausible explanation is that the table gets fragmented.
You can verify this by defragmenting the table every so often and checking whether performance returns to the 3 or 4 items per second rate (which, by the way, is a priori relatively slow, though that depends on hardware, data schema, and other specifics). Of course, you'll need to weigh the time defragmentation takes against the time lost to the slow update rate to find an optimal defragmentation frequency.
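If you want to try this on SQLite specifically (this answer was written with MySQL in mind), the closest equivalent to defragmenting a table is VACUUM, which rebuilds the whole database file. A minimal sketch, with a placeholder file name:

    import sqlite3, time

    conn = sqlite3.connect("data.db")
    conn.isolation_level = None     # autocommit: VACUUM cannot run inside a transaction

    start = time.time()
    conn.execute("VACUUM")          # rewrites the database file, removing fragmentation
    print("VACUUM took %.1f s" % (time.time() - start))
    conn.close()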

If the slowdown is indeed caused, at least in part, by fragmentation, you may also look into performing the updates in a particular order. It is hard to be more specific without knowing the details of the overall schema and the data's statistical profile, but fragmentation is sensitive to the order in which changes to the database take place.

A final suggestion to boost overall update performance is (if possible) to drop a few indexes on the table, perform the updates, and then recreate the indexes. This counter-intuitive approach works for relatively big updates because the cost of rebuilding the indexes from scratch is often less than the cumulative cost of maintaining them as the update progresses.
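A rough sketch of that pattern, again with hypothetical table and index names:

    import sqlite3

    conn = sqlite3.connect("data.db")

    with conn:
        # Drop the secondary index up front so each UPDATE no longer has to maintain it.
        conn.execute("DROP INDEX IF EXISTS idx_items_value")

        # ... perform the big batch of updates ...
        conn.executemany("UPDATE items SET value = ? WHERE key = ?",
                         [(i, "key%d" % i) for i in range(20000)])

        # Rebuild the index once, after all rows have been touched.
        conn.execute("CREATE INDEX idx_items_value ON items(value)")
    conn.close()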