How can I get all rows with keys provided in a list using SQLalchemy? How can I get all rows with keys provided in a list using SQLalchemy? python python

How can I get all rows with keys provided in a list using SQLalchemy?


Your code is absolutety fine.

IN is like a bunch of X=Y joined with OR and is pretty fast in contemporary databases.

However, if your list of IDs is long, you could make the query a bit more efficient by passing a sub-query returning the list of IDs.


The code as is is completely fine. However, someone is asking me for some system of hedging between the two approaches of doing a big IN vs. using get() for individual IDs.

If someone is really trying to avoid the SELECT, then the best way to do that is to set up the objects you need in memory ahead of time. Such as, you're working on a large table of elements. Break up the work into chunks, such as, order the full set of work by primary key, or by date range, whatever, then load everything for that chunk locally into a cache:

 all_ids = [<huge list of ids>] all_ids.sort() while all_ids:     chunk = all_ids[0:1000]     # bonus exercise!  Throw each chunk into a multiprocessing.pool()!     all_ids = all_ids[1000:]     my_cache = dict(           Session.query(Record.id, Record).filter(                 Record.id.between(chunk[0], chunk[-1]))     )     for id_ in chunk:         my_obj = my_cache[id_]         <work on my_obj>

That's the real world use case.

But to also illustrate some SQLAlchemy API, we can make a function that does the IN for records we don't have and a local get for those we do. Here is that:

from sqlalchemy import inspectdef get_all(session, cls, seq):    mapper = inspect(cls)    lookup = set()    for ident in seq:        key = mapper.identity_key_from_primary_key((ident, ))        if key in session.identity_map:            yield session.identity_map[key]        else:            lookup.add(ident)    if lookup:        for obj in session.query(cls).filter(cls.id.in_(lookup)):            yield obj

Here is a demonstration:

from sqlalchemy import Column, Integer, create_engine, Stringfrom sqlalchemy.orm import Sessionfrom sqlalchemy.ext.declarative import declarative_baseimport randomBase = declarative_base()class A(Base):    __tablename__ = 'a'    id = Column(Integer, primary_key=True)    data = Column(String)e = create_engine("sqlite://", echo=True)Base.metadata.create_all(e)ids = range(1, 50)s = Session(e)s.add_all([A(id=i, data='a%d' % i) for i in ids])s.commit()s.close()already_loaded = s.query(A).filter(A.id.in_(random.sample(ids, 10))).all()assert len(s.identity_map) == 10to_load = set(random.sample(ids, 25))all_ = list(get_all(s, A, to_load))assert set(x.id for x in all_) == to_load


If you use composite primary keys, you can use tuple_, as in

from sqlalchemy import tuple_session.query(Record).filter(tuple_(Record.id1, Record.id2).in_(seq)).all()

Note that this is not available on SQLite (see doc).