Is get_or_create() thread-safe?


No, get_or_create is not atomic.

It first asks the database whether a matching row exists; the database answers, Python checks the result, and if no row was found it creates one. Anything can happen between the get and the create, including some other code creating a row that matches the get criteria.

For instance, with regard to your specific issue: if the user has two pages open (or several AJAX requests are performed) at the same time, every get might fail and every request might then create a new row, all with the same session.

It is thus important to use get_or_create only when duplicates will be caught by the database through a unique or unique_together constraint, so that even though multiple threads can reach the save() call, only one will succeed, and the others will raise an IntegrityError that you can catch and deal with.
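To make the "let the database catch it" idea concrete, here is a minimal sketch that uses the standard library's sqlite3 module rather than Django itself. The visit table, the get_or_create helper and its before_insert hook are all hypothetical names invented for this illustration; the hook only simulates another writer sneaking in between the get and the create.

```python
import sqlite3

SELECT_SQL = "SELECT id FROM visit WHERE session_key = ?"
INSERT_SQL = "INSERT INTO visit (session_key) VALUES (?)"

def get_or_create(conn, session_key, before_insert=None):
    """Hypothetical helper (not Django's API): returns (row_id, created)."""
    row = conn.execute(SELECT_SQL, (session_key,)).fetchone()  # the "get"
    if row is not None:
        return row[0], False
    if before_insert is not None:
        before_insert()  # illustration only: a competing writer acts in the gap
    try:
        cur = conn.execute(INSERT_SQL, (session_key,))  # the "create"
        conn.commit()
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected our insert: someone else won the race,
        # so roll back and fetch the row they created.
        conn.rollback()
        row = conn.execute(SELECT_SQL, (session_key,)).fetchone()
        return row[0], False

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what actually prevents duplicates.
conn.execute("CREATE TABLE visit (id INTEGER PRIMARY KEY, session_key TEXT UNIQUE)")

def competing_insert():
    # Simulates another request inserting the same key first.
    conn.execute(INSERT_SQL, ("abc",))
    conn.commit()

raced = get_or_create(conn, "abc", before_insert=competing_insert)
again = get_or_create(conn, "abc")
row_count = conn.execute("SELECT COUNT(*) FROM visit").fetchone()[0]
```

Even though the helper lost the race, the table ends up with a single row and the caller gets that row back. In Django the same shape would presumably rely on unique=True or unique_together on the model and on catching django.db.IntegrityError around the get_or_create call.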

If you use get_or_create with a field (or set of fields) that is not unique in the database, concurrent requests can create duplicate rows, which is rarely what you want.

More generally: do not rely on your application to enforce uniqueness and avoid duplicates in your database! That's the database's job! (Well, unless you wrap your critical functions with some OS-level locks, but I would still suggest using the database.)

With these warnings in mind, get_or_create, used correctly, is an easy-to-read, easy-to-write construct that perfectly complements the database's integrity checks.

Actually, it's not thread-safe. You can look at the code of the get_or_create method of the QuerySet object; basically, what it does is the following:

    try:
        return self.get(**lookup), False
    except self.model.DoesNotExist:
        params = dict([(k, v) for k, v in kwargs.items() if '__' not in k])
        params.update(defaults)
        obj = self.model(**params)
        sid = transaction.savepoint(using=self.db)
        obj.save(force_insert=True, using=self.db)
        transaction.savepoint_commit(sid, using=self.db)
        return obj, True

So two threads might both figure out that the instance does not exist in the DB and each start creating a new one, then save them one after the other.
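The interleaving can be reproduced on purpose with a small sketch. This is not Django code: a plain list stands in for a table without any unique constraint, naive_get_or_create is a hypothetical stand-in for the get-then-create logic, and a threading.Barrier is used only to force both threads to finish their "get" before either one does its "create".

```python
import threading

rows = []                        # stands in for a table with no unique constraint
barrier = threading.Barrier(2)   # holds both threads in the gap between get and create
results = []

def naive_get_or_create(key):
    """Hypothetical check-then-act helper: returns (value, created)."""
    found = [r for r in rows if r == key]  # the "get"
    barrier.wait()                         # both threads have now looked and found nothing
    if found:
        return found[0], False
    rows.append(key)                       # the "create"
    return key, True

def worker():
    results.append(naive_get_or_create("session-1"))

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both threads report that they created the row, and rows ends up holding "session-1" twice. In a real application the timing is left to chance rather than forced by a barrier, which is exactly why the duplicates only show up occasionally.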