SQLAlchemy: What's the difference between flush() and commit()?
A Session object is basically an ongoing transaction of changes to a database (update, insert, delete). These operations aren't persisted to the database until they are committed (if your program aborts for some reason in mid-session transaction, any uncommitted changes within are lost).
The session object registers transaction operations with
session.add(), but doesn't yet communicate them to the database until
session.flush() is called.
session.flush() communicates a series of operations to the database (insert, update, delete). The database maintains them as pending operations in a transaction. The changes aren't persisted permanently to disk, or visible to other transactions until the database receives a COMMIT for the current transaction (which is what
session.commit() commits (persists) those changes to the database.
flush() is always called as part of a call to
When you use a Session object to query the database, the query will return results both from the database and from the flushed parts of the uncommitted transaction it holds. By default, Session objects
autoflush their operations, but this can be disabled.
Hopefully this example will make this clearer:
#---s = Session()s.add(Foo('A')) # The Foo('A') object has been added to the session. # It has not been committed to the database yet, # but is returned as part of a query.print 1, s.query(Foo).all()s.commit()#---s2 = Session()s2.autoflush = Falses2.add(Foo('B'))print 2, s2.query(Foo).all() # The Foo('B') object is *not* returned # as part of this query because it hasn't # been flushed yet.s2.flush() # Now, Foo('B') is in the same state as # Foo('A') was above.print 3, s2.query(Foo).all() s2.rollback() # Foo('B') has not been committed, and rolling # back the session's transaction removes it # from the session.print 4, s2.query(Foo).all()#---Output:1 [<Foo('A')>]2 [<Foo('A')>]3 [<Foo('A')>, <Foo('B')>]4 [<Foo('A')>]
As @snapshoe says
flush()sends your SQL statements to the database
commit()commits the transaction.
session.autocommit == False:
commit() will call
flush() if you set
autoflush == True.
session.autocommit == True:
You can't call
commit() if you haven't started a transaction (which you probably haven't since you would probably only use this mode to avoid manually managing transactions).
In this mode, you must call
flush() to save your ORM changes. The flush effectively also commits your data.
This does not strictly answer the original question but some people have mentioned that with
session.autoflush = True you don't have to use
session.flush()... And this is not always true.
If you want to use the id of a newly created object in the middle of a transaction, you must call
# Given a model with at least this idclass AModel(Base): id = Column(Integer, primary_key=True) # autoincrement by default on integer primary keysession.autoflush = Truea = AModel()session.add(a)a.id # Nonesession.flush()a.id # autoincremented integer
This is because
autoflush does NOT auto fill the id (although a query of the object will, which sometimes can cause confusion as in "why this works here but not there?" But snapshoe already covered this part).
One related aspect that seems pretty important to me and wasn't really mentioned:
Why would you not commit all the time? - The answer is atomicity.
A fancy word to say: an ensemble of operations have to all be executed successfully OR none of them will take effect.
For example, if you want to create/update/delete some object (A) and then create/update/delete another (B), but if (B) fails you want to revert (A). This means those 2 operations are atomic.
Therefore, if (B) needs a result of (A), you want to call
flush after (A) and
commit after (B).
session.autoflush is True, except for the case that I mentioned above or others in Jimbo's answer, you will not need to call