Django filter queryset __in for *every* item in list


Summary:

One option is, as suggested by jpic and sgallen in the comments, to add .filter() for each category. Each additional filter adds another join, which should not be a problem for a small set of categories.

Another option is the aggregation approach. The query is shorter and may be quicker for a large set of categories.

You also have the option of using custom queries.


Some examples

Test setup:

class Photo(models.Model):
    tags = models.ManyToManyField('Tag')

class Tag(models.Model):
    name = models.CharField(max_length=50)

    def __unicode__(self):
        return self.name

In [2]: t1 = Tag.objects.create(name='holiday')
In [3]: t2 = Tag.objects.create(name='summer')
In [4]: p = Photo.objects.create()
In [5]: p.tags.add(t1)
In [6]: p.tags.add(t2)
In [7]: p.tags.all()
Out[7]: [<Tag: holiday>, <Tag: summer>]

Using chained filters approach:

In [8]: Photo.objects.filter(tags=t1).filter(tags=t2)
Out[8]: [<Photo: Photo object>]

Resulting query:

In [17]: print Photo.objects.filter(tags=t1).filter(tags=t2).query
SELECT "test_photo"."id"
FROM "test_photo"
INNER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
INNER JOIN "test_photo_tags" T4 ON ("test_photo"."id" = T4."photo_id")
WHERE ("test_photo_tags"."tag_id" = 3  AND T4."tag_id" = 4 )

Note that each additional filter adds another JOIN to the query.

Using annotation approach:

In [29]: from django.db.models import Count
In [30]: Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count('tags')).filter(num_tags=2)
Out[30]: [<Photo: Photo object>]

Resulting query:

In [32]: print Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count('tags')).filter(num_tags=2).query
SELECT "test_photo"."id", COUNT("test_photo_tags"."tag_id") AS "num_tags"
FROM "test_photo"
LEFT OUTER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
WHERE ("test_photo_tags"."tag_id" IN (3, 4))
GROUP BY "test_photo"."id", "test_photo"."id"
HAVING COUNT("test_photo_tags"."tag_id") = 2
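The GROUP BY / HAVING query above generalizes to any number of tags by comparing the count against the length of the tag list instead of a hard-coded 2. Here is a minimal sketch of that idea using the standard-library sqlite3 module rather than Django. The table names (photo, tag, photo_tags) and the helper photos_with_all_tags are made up for illustration and only mirror the test setup.

```python
import sqlite3


def photos_with_all_tags(conn, tag_names):
    """Return ids of photos that carry every tag name in tag_names."""
    names = list(dict.fromkeys(tag_names))  # drop duplicates, keep order
    if not names:
        # Design choice for this sketch: no tags requested means no matches.
        return []
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT pt.photo_id
        FROM photo_tags AS pt
        JOIN tag ON tag.id = pt.tag_id
        WHERE tag.name IN ({placeholders})
        GROUP BY pt.photo_id
        HAVING COUNT(DISTINCT tag.id) = ?
        ORDER BY pt.photo_id
    """
    # The last parameter is the number of distinct tags that must match.
    return [row[0] for row in conn.execute(sql, (*names, len(names)))]


# Hypothetical data: photo 1 has both tags, photo 2 only has 'holiday'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photo (id INTEGER PRIMARY KEY);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE photo_tags (photo_id INTEGER, tag_id INTEGER);
    INSERT INTO photo (id) VALUES (1), (2);
    INSERT INTO tag (id, name) VALUES (1, 'holiday'), (2, 'summer');
    INSERT INTO photo_tags VALUES (1, 1), (1, 2), (2, 1);
""")

print(photos_with_all_tags(conn, ["holiday", "summer"]))  # [1]
```

In the ORM, the corresponding change would be .filter(num_tags=len(tags)) in place of .filter(num_tags=2).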

ANDed Q objects would not work:

In [9]: from django.db.models import Q
In [10]: Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer'))
Out[10]: []
In [11]: from operator import and_
In [12]: Photo.objects.filter(reduce(and_, [Q(tags__name='holiday'), Q(tags__name='summer')]))
Out[12]: []

Resulting query:

In [25]: print Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer')).query
SELECT "test_photo"."id"
FROM "test_photo"
INNER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
INNER JOIN "test_tag" ON ("test_photo_tags"."tag_id" = "test_tag"."id")
WHERE ("test_tag"."name" = holiday  AND "test_tag"."name" = summer )
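The WHERE clause shows why this returns nothing: there is only one join to the tag table, so both conditions are tested against the same tag row, and no single tag is named both 'holiday' and 'summer'. The difference can be sketched in plain Python, with a made-up list of tag names standing in for the joined rows of one photo:

```python
# Tag names attached to one photo; each entry stands for one joined row.
photo_tag_rows = ["holiday", "summer"]

# One filter() with ANDed conditions: both tests hit the same row.
same_row_match = any(
    name == "holiday" and name == "summer" for name in photo_tag_rows
)

# Chained filter() calls: each condition gets its own join, so each
# test may be satisfied by a different row.
separate_row_match = (
    any(name == "holiday" for name in photo_tag_rows)
    and any(name == "summer" for name in photo_tag_rows)
)

print(same_row_match, separate_row_match)  # False True
```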


Another approach that works, although PostgreSQL only, is using django.contrib.postgres.fields.ArrayField:

Example copied from docs:

>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.create(name='Third post', tags=['tutorial', 'django'])
>>> Post.objects.filter(tags__contains=['thoughts'])
<QuerySet [<Post: First post>, <Post: Second post>]>
>>> Post.objects.filter(tags__contains=['django'])
<QuerySet [<Post: First post>, <Post: Third post>]>
>>> Post.objects.filter(tags__contains=['django', 'thoughts'])
<QuerySet [<Post: First post>]>

ArrayField has some more powerful features such as overlap and index transforms.
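For intuition, contains behaves like a superset test and overlap like a non-empty intersection test. The sketch below imitates those two lookups with Python sets over the three posts from the docs example. It is plain Python, not the Django API, and the function names are invented for illustration.

```python
# The three posts from the docs example, as name -> tags.
posts = {
    "First post": ["thoughts", "django"],
    "Second post": ["thoughts"],
    "Third post": ["tutorial", "django"],
}


def contains(tags, wanted):
    """True if every wanted value is present in tags."""
    return set(wanted) <= set(tags)


def overlap(tags, wanted):
    """True if tags shares at least one value with wanted."""
    return bool(set(tags) & set(wanted))


def select(predicate, wanted):
    """Names of posts whose tags satisfy predicate(tags, wanted)."""
    return [name for name, tags in posts.items() if predicate(tags, wanted)]


print(select(contains, ["django", "thoughts"]))  # ['First post']
print(select(overlap, ["django", "thoughts"]))
```

Only contains answers the "every item in the list" question; overlap matches posts that have any one of the listed tags.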


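This can also be attempted with dynamic query generation using the Django ORM and some Python magic :)

from functools import reduce  # built in on Python 2, must be imported on Python 3
from operator import and_
from django.db.models import Q

categories = ['holiday', 'summer']
res = Photo.objects.filter(reduce(and_, [Q(tags__name=c) for c in categories]))

The idea is to generate a Q object for each category and then combine them with the AND operator into a single filter condition. For your example, this is equivalent to

res = Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer'))

Note, however, that for a many-to-many relation such as tags this returns an empty result, as the ANDed Q objects example above shows.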
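Since conditions ANDed inside a single filter() call must all hold for the same related tag, a dynamic variant that matches photos carrying every category is to chain one filter() call per value, which gives each condition its own join. A minimal sketch follows. filter_every is a hypothetical helper name, and RecordingQuerySet is only a stand-in so the snippet runs without a Django project; with Django you would pass Photo.objects.all() instead.

```python
from functools import reduce


def filter_every(queryset, lookup, values):
    """Apply one .filter(lookup=value) call per value, chained in order."""
    return reduce(
        lambda qs, value: qs.filter(**{lookup: value}),
        values,
        queryset,
    )


class RecordingQuerySet:
    """Stand-in for a Django QuerySet that only records filter() calls."""

    def __init__(self, calls=()):
        self.calls = tuple(calls)

    def filter(self, **kwargs):
        # Like a real QuerySet, return a new object instead of mutating.
        return RecordingQuerySet(self.calls + (kwargs,))


chained = filter_every(RecordingQuerySet(), "tags__name", ["holiday", "summer"])
print(chained.calls)
```

With the models from the test setup, filter_every(Photo.objects.all(), 'tags__name', categories) would build the same query as Photo.objects.filter(tags__name='holiday').filter(tags__name='summer').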