SQLAlchemy: What's the difference between flush() and commit()? SQLAlchemy: What's the difference between flush() and commit()? python python

SQLAlchemy: What's the difference between flush() and commit()?


A Session object is basically an ongoing transaction of changes to a database (update, insert, delete). These operations aren't persisted to the database until they are committed (if your program aborts for some reason in mid-session transaction, any uncommitted changes within are lost).

The session object registers transaction operations with session.add(), but doesn't yet communicate them to the database until session.flush() is called.

session.flush() communicates a series of operations to the database (insert, update, delete). The database maintains them as pending operations in a transaction. The changes aren't persisted permanently to disk, or visible to other transactions until the database receives a COMMIT for the current transaction (which is what session.commit() does).

session.commit() commits (persists) those changes to the database.

flush() is always called as part of a call to commit() (1).

When you use a Session object to query the database, the query will return results both from the database and from the flushed parts of the uncommitted transaction it holds. By default, Session objects autoflush their operations, but this can be disabled.

Hopefully this example will make this clearer:

#---s = Session()s.add(Foo('A')) # The Foo('A') object has been added to the session.                # It has not been committed to the database yet,                #   but is returned as part of a query.print 1, s.query(Foo).all()s.commit()#---s2 = Session()s2.autoflush = Falses2.add(Foo('B'))print 2, s2.query(Foo).all() # The Foo('B') object is *not* returned                             #   as part of this query because it hasn't                             #   been flushed yet.s2.flush()                   # Now, Foo('B') is in the same state as                             #   Foo('A') was above.print 3, s2.query(Foo).all() s2.rollback()                # Foo('B') has not been committed, and rolling                             #   back the session's transaction removes it                             #   from the session.print 4, s2.query(Foo).all()#---Output:1 [<Foo('A')>]2 [<Foo('A')>]3 [<Foo('A')>, <Foo('B')>]4 [<Foo('A')>]


As @snapshoe says

flush() sends your SQL statements to the database

commit() commits the transaction.

When session.autocommit == False:

commit() will call flush() if you set autoflush == True.

When session.autocommit == True:

You can't call commit() if you haven't started a transaction (which you probably haven't since you would probably only use this mode to avoid manually managing transactions).

In this mode, you must call flush() to save your ORM changes. The flush effectively also commits your data.


This does not strictly answer the original question but some people have mentioned that with session.autoflush = True you don't have to use session.flush()... And this is not always true.

If you want to use the id of a newly created object in the middle of a transaction, you must call session.flush().

# Given a model with at least this idclass AModel(Base):   id = Column(Integer, primary_key=True)  # autoincrement by default on integer primary keysession.autoflush = Truea = AModel()session.add(a)a.id  # Nonesession.flush()a.id  # autoincremented integer

This is because autoflush does NOT auto fill the id (although a query of the object will, which sometimes can cause confusion as in "why this works here but not there?" But snapshoe already covered this part).


One related aspect that seems pretty important to me and wasn't really mentioned:

Why would you not commit all the time? - The answer is atomicity.

A fancy word to say: an ensemble of operations have to all be executed successfully OR none of them will take effect.

For example, if you want to create/update/delete some object (A) and then create/update/delete another (B), but if (B) fails you want to revert (A). This means those 2 operations are atomic.

Therefore, if (B) needs a result of (A), you want to call flush after (A) and commit after (B).

Also, if session.autoflush is True, except for the case that I mentioned above or others in Jimbo's answer, you will not need to call flush manually.