Efficient sqlite query based on list of primary keys Efficient sqlite query based on list of primary keys sqlite sqlite

Efficient sqlite query based on list of primary keys


If the number of elements in the IN is large enough, SQLite constructs a temporary index for them. This is likely to be more efficient than creating a temporary table manually.

The length of the IN list is limited only be the maximum length of an SQL statement, and by memory.


Because the statement you wrote does not include any instructions to SQLite about how to find the rows you want the concept of "optimizing" doesn't really exist -- there's nothing to optimize. The job of planning the best algorithm to retrieve the data belongs to the SQLite query optimizer.

Some databases do have idiosyncrasies in their query optimizers which can lead to performance issues but I wouldn't expect SQLite to have any trouble finding the correct algorithm for this simple query, even with lots of values in the IN list. I would only worry about trying to guide the query optimizer to another execution plan if and when you find that there's a performance problem.


SQLite Optimizer Overview

IN (expression-list) does use an index if available.

Beyond that, I can't glean any guarantees from it, so the following is subject to a performance measaurement.

Axis 1: how to pass the expression-list

  • hardocde as string. Overhead for int-to-string conversion and string-to-int parsing
  • bind parameters (i.e. the statement is ... WHERE CustomerID in (?,?,?,?,?,?,?,?,?,?....), which is easier to build from a predefined string than hardcoded values). Prevents int → string → int conversion, but the default limit for number of parameters is 999. This can be increased by SQLITE_LIMIT_VARIABLE_NUMBER, but might lead to excessive allocations.
  • Temporary table. Possibly less efficient than any of the above methods after the statement is prepared, but that doesn't help if most time is spent preparing the statement

Axis 2: Statement optimization

If the same expression-list is used in multiple queries against changing CustomerIDs, one of the following may help:

  • reusing a prepared statement with hardcoded values (i.e. don't pass 1001 parameters)
  • create a temporary table for the CustomerIDs with index (so the index is created once, not on the fly for every query)

If the expression-list is different with every query, ist is probably best to let SQLite do its job. The following might be an improvement

  • create a temp table for the expression-list
  • bulk-insert expression-list elements using union all
  • use a sub query

(from my experience with SQLite, I'd expect it to be on par or slightly worse)


Axis 3 Ask Richard

the sqlite mailing list (yeah I know, that technology even older than rotary phones!) is pretty active with often excellent advise, including from the author of SQLite. 90% chance someone will dismiss you ass "Measure before asking suhc a question!", 10% chance someone gives you detailed insight.