Django select only rows with duplicate field values Django select only rows with duplicate field values django django

Django select only rows with duplicate field values


Try:

from django.db.models import CountLiteral.objects.values('name')               .annotate(Count('id'))                .order_by()               .filter(id__count__gt=1)

This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:

dupes = Literal.objects.values('name')                       .annotate(Count('id'))                       .order_by()                       .filter(id__count__gt=1)Literal.objects.filter(name__in=[item['name'] for item in dupes])


This was rejected as an edit. So here it is as a better answer

dups = (    Literal.objects.values('name')    .annotate(count=Count('id'))    .values('name')    .order_by()    .filter(count__gt=1))

This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The django ORM is smart enough to combine these into a single query:

Literal.objects.filter(name__in=dups)

The extra call to .values('name') after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.


try using aggregation

Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)