Annotating with Count not working how I expect it to Annotating with Count not working how I expect it to django django

Annotating with Count not working how I expect it to


1. Quick explanation

When you test a many-to-many relation against None in a Django queryset, as you do here:

Q(profilearmor__profile=None)

that matches rows where there are no corresponding rows in the many-to-many relation. So your query

self.armor_category.armor_set.filter(    Q(profilearmor__profile=self.request.user.profile) |    Q(profilearmor__profile=None))

matches items of armor which either the self.request.user has access to, or which no-one has access to. That's why your query for the second user failed to return a row for the Duck Helm: because someone (namely the first user) had access to it.

2. What to do instead

The query that you want to run looks like this in SQL:

SELECT `myapp_armor`.`id`, `myapp_armor`.`armor_category_id`, `myapp_armor`.`name`,       COUNT(`myapp_profilearmor`.`id`) AS `profile_armor_count`FROM `myapp_armor`LEFT OUTER JOIN `myapp_profilearmor`             ON `myapp_armor`.`id` = `myapp_profilearmor`.`armor_id`                AND `myapp_profilearmor`.`profile_id` = %sWHERE `myapp_armor`.`armor_category_id` = %sGROUP BY `myapp_armor`.`id`, `myapp_armor`.`armor_category_id`, `myapp_armor`.`name`

Unfortunately, Django's object-relational mapping system doesn't seem to provide a way to express this query. But you can always bypass the ORM and issue a raw SQL query. Like this:

sql = '''    SELECT `myapp_armor`.`id`, `myapp_armor`.`armor_category_id`, `myapp_armor`.`name`,           COUNT(`myapp_profilearmor`.`id`) AS `profile_armor_count`    FROM `myapp_armor`    LEFT OUTER JOIN `myapp_profilearmor`                 ON `myapp_armor`.`id` = `myapp_profilearmor`.`armor_id`                    AND `myapp_profilearmor`.`profile_id` = %s    WHERE `myapp_armor`.`armor_category_id` = %s    GROUP BY `myapp_armor`.`id`, `myapp_armor`.`armor_category_id`, `myapp_armor`.`name`'''armor = Armor.objects.raw(sql, [self.request.user.profile.id, self.armor_category.id])for armor_item in armor:    print('{:14}{}'.format(armor_item.name, armor_item.profile_armor_count))

For example:

>>> helmets = ArmorCategory.objects.get(id=1)>>> profile = Profile.objects.get(id=1)>>> armor = Armor.objects.raw(sql, [profile.id, helmets.id])>>> for armor_item in armor:...     print('{:14}{}'.format(armor_item.name, armor_item.profile_armor_count))... Dragon Helm   0Duck Helm     1Needle Helm   0

3. How you could have figured this out for yourself

Here's the offending Django query:

armor = self.armor_category.armor_set.filter(    Q(profilearmor__profile=self.request.user.profile) |    Q(profilearmor__profile=None)).annotate(profile_armor_count=Count('profilearmor__id'))

When you don't understand why a query is producing the wrong results, it's always worth looking at the actual SQL, which you can do by taking the queryset's query attribute and converting it to a string:

>>> from django.db.models import Q>>> helmets = ArmorCategory.objects.get(name='Helmets')>>> profile = Profile.objects.get(id=1)>>> print(helmets.armor_set.filter(Q(profilearmor__profile=profile) |...                                Q(profilearmor__profile=None)... ).annotate(profile_armor_count=Count('profilearmor__id')).query)SELECT `myapp_armor`.`id`, `myapp_armor`.`armor_category_id`, `myapp_armor`.`name`,       COUNT(`myapp_profilearmor`.`id`) AS `profile_armor_count`FROM `myapp_armor`LEFT OUTER JOIN `myapp_profilearmor`             ON (`myapp_armor`.`id` = `myapp_profilearmor`.`armor_id`) LEFT OUTER JOIN `myapp_profile`             ON (`myapp_profilearmor`.`profile_id` = `myapp_profile`.`id`) WHERE (`myapp_armor`.`armor_category_id` = 1       AND (`myapp_profilearmor`.`profile_id` = 1            OR `myapp_profile`.`id` IS NULL))GROUP BY `myapp_armor`.`id`, `myapp_armor`.`armor_category_id`, `myapp_armor`.`name`ORDER BY NULL

What does this mean? If you know all about SQL joins, you can skip to section 5. Otherwise, read on.

4. Introduction to SQL joins

As I'm sure you know, a join of tables in SQL consists of combinations of rows from those tables (subject to some conditions that you specify in your query). For example, if you have two categories of armor and six items of armor:

mysql> SELECT * FROM myapp_armorcategory;+----+---------+---------+| id | name    | slug    |+----+---------+---------+|  1 | Helmets | helmets ||  2 | Suits   | suits   |+----+---------+---------+2 rows in set (0.00 sec)mysql> SELECT * FROM myapp_armor;+----+-------------------+-------------+| id | armor_category_id | name        |+----+-------------------+-------------+|  1 |                 1 | Dragon Helm ||  2 |                 1 | Duck Helm   ||  3 |                 1 | Needle Helm ||  4 |                 2 | Spiky Suit  ||  5 |                 2 | Flower Suit ||  6 |                 2 | Battle Suit |+----+-------------------+-------------+6 rows in set (0.00 sec)

then a JOIN of those two tables will contain all 12 combinations of the 2 rows from the first table and the 6 rows of the second table:

mysql> SELECT * FROM myapp_armorcategory JOIN myapp_armor;+----+---------+---------+----+-------------------+-------------+| id | name    | slug    | id | armor_category_id | name        |+----+---------+---------+----+-------------------+-------------+|  1 | Helmets | helmets |  1 |                 1 | Dragon Helm ||  2 | Suits   | suits   |  1 |                 1 | Dragon Helm ||  1 | Helmets | helmets |  2 |                 1 | Duck Helm   ||  2 | Suits   | suits   |  2 |                 1 | Duck Helm   ||  1 | Helmets | helmets |  3 |                 1 | Needle Helm ||  2 | Suits   | suits   |  3 |                 1 | Needle Helm ||  1 | Helmets | helmets |  4 |                 2 | Spiky Suit  ||  2 | Suits   | suits   |  4 |                 2 | Spiky Suit  ||  1 | Helmets | helmets |  5 |                 2 | Flower Suit ||  2 | Suits   | suits   |  5 |                 2 | Flower Suit ||  1 | Helmets | helmets |  6 |                 2 | Battle Suit ||  2 | Suits   | suits   |  6 |                 2 | Battle Suit |+----+---------+---------+----+-------------------+-------------+12 rows in set (0.00 sec)

Normally one adds conditions to the join to restrict the rows returned so that they make sense. For example, when joining the armor category table with the armor table we're only interested in combinations where the armor belongs to the armor category:

mysql> SELECT * FROM myapp_armorcategory JOIN myapp_armor           ON myapp_armorcategory.id = myapp_armor.armor_category_id;+----+---------+---------+----+-------------------+-------------+| id | name    | slug    | id | armor_category_id | name        |+----+---------+---------+----+-------------------+-------------+|  1 | Helmets | helmets |  1 |                 1 | Dragon Helm ||  1 | Helmets | helmets |  2 |                 1 | Duck Helm   ||  1 | Helmets | helmets |  3 |                 1 | Needle Helm ||  2 | Suits   | suits   |  4 |                 2 | Spiky Suit  ||  2 | Suits   | suits   |  5 |                 2 | Flower Suit ||  2 | Suits   | suits   |  6 |                 2 | Battle Suit |+----+---------+---------+----+-------------------+-------------+6 rows in set (0.08 sec)

This is all straightforward. But a problem arises where there are records in one table that have no matching item in the other. Let's add a new armor category with no corresponding items of armor:

mysql> INSERT INTO myapp_armorcategory (name, slug) VALUES ('Arm Guards', 'armguards');Query OK, 1 row affected (0.00 sec)

If we re-run the query above (SELECT * FROM myapp_armorcategory JOIN myapp_armor ON myapp_armorcategory.id = myapp_armor.armor_category_id;) we'll get the same result: the new armor category has no matching records in the armor table so it doesn't appear in the JOIN. If we want to see all armor categories appear in the results, regardless of whether they have a matching row in the other table, we have to run a so-called outer join: in particular a LEFT OUTER JOIN:

mysql> SELECT * FROM myapp_armorcategory LEFT OUTER JOIN myapp_armor           ON myapp_armorcategory.id = myapp_armor.armor_category_id;+----+------------+-----------+------+-------------------+-------------+| id | name       | slug      | id   | armor_category_id | name        |+----+------------+-----------+------+-------------------+-------------+|  1 | Helmets    | helmets   |    1 |                 1 | Dragon Helm ||  1 | Helmets    | helmets   |    2 |                 1 | Duck Helm   ||  1 | Helmets    | helmets   |    3 |                 1 | Needle Helm ||  2 | Suits      | suits     |    4 |                 2 | Spiky Suit  ||  2 | Suits      | suits     |    5 |                 2 | Flower Suit ||  2 | Suits      | suits     |    6 |                 2 | Battle Suit ||  3 | Arm Guards | armguards | NULL |              NULL | NULL        |+----+------------+-----------+------+-------------------+-------------+7 rows in set (0.00 sec)

For each row in the left-hand table that has no matching rows in the right-hand table, the left outer join contains a row with NULL for each column on the right.

5. Explaining why the query goes wrong

Let's create a single profile who has access to the Duck Helm:

mysql> INSERT INTO myapp_profile (name) VALUES ('user1');Query OK, 1 row affected (0.00 sec)mysql> INSERT INTO myapp_profilearmor (profile_id, armor_id) VALUES (1, 2);Query OK, 1 row affected (0.00 sec)

Then let's run a simplified version of your query to try to understand what it is doing:

mysql> SELECT `myapp_armor`.*, `myapp_profilearmor`.*, T5.*        FROM `myapp_armor`        LEFT OUTER JOIN `myapp_profilearmor`                    ON (`myapp_armor`.`id` = `myapp_profilearmor`.`armor_id`)        LEFT OUTER JOIN `myapp_profile` T5                    ON (`myapp_profilearmor`.`profile_id` = T5.`id`)        WHERE `myapp_armor`.`armor_category_id` = 1;+----+-------------------+-------------+------+------------+----------+------+-------+| id | armor_category_id | name        | id   | profile_id | armor_id | id   | name  |+----+-------------------+-------------+------+------------+----------+------+-------+|  1 |                 1 | Dragon Helm | NULL |       NULL |     NULL | NULL | NULL  ||  2 |                 1 | Duck Helm   |    1 |          1 |        1 |    1 | user1 ||  3 |                 1 | Needle Helm | NULL |       NULL |     NULL | NULL | NULL  |+----+-------------------+-------------+------+------------+----------+------+-------+3 rows in set (0.04 sec)

So when you add the condition (myapp_profilearmor.profile_id = 1 OR T5.id IS NULL) then you'll get all the rows from this join.

Now let's create a second profile with access to the Dragon Helm:

mysql> INSERT INTO myapp_profile (name) VALUES ('user2');Query OK, 1 row affected (0.00 sec)mysql> INSERT INTO myapp_profilearmor (profile_id, armor_id) VALUES (2, 1);Query OK, 1 row affected, 1 warning (0.09 sec)

And re-run the join:

mysql> SELECT `myapp_armor`.*, `myapp_profilearmor`.*, T5.*        FROM `myapp_armor`        LEFT OUTER JOIN `myapp_profilearmor`                    ON (`myapp_armor`.`id` = `myapp_profilearmor`.`armor_id`)        LEFT OUTER JOIN `myapp_profile` T5                    ON (`myapp_profilearmor`.`profile_id` = T5.`id`)        WHERE `myapp_armor`.`armor_category_id` = 1;+----+-------------------+-------------+------+------------+----------+------+-------+| id | armor_category_id | name        | id   | profile_id | armor_id | id   | name  |+----+-------------------+-------------+------+------------+----------+------+-------+|  1 |                 1 | Dragon Helm |    2 |          2 |        1 |    2 | user2 ||  2 |                 1 | Duck Helm   |    1 |          1 |        2 |    1 | user1 ||  3 |                 1 | Needle Helm | NULL |       NULL |     NULL | NULL | NULL  |+----+-------------------+-------------+------+------------+----------+------+-------+3 rows in set (0.00 sec)

Now you can see that when you run the query for the second user profile, the extra condition on the join will be (myapp_profilearmor.profile_id = 2 OR T5.id IS NULL) and this will select only rows 1 and 3. Row 2 will be missing.

So you can see that your original filter Q(profilearmor__profile=None) became the subclause T5.id IS NULL and this only selects rows where there is no entry (for any profile) in the ProfileArmor relation.

6. Other comments on your code

  1. You test to see if the profile has access to any type of armor in a category using the count() method on the queryset:

    if self.armor_category.armor_set.filter(    profilearmor__profile=self.request.user.profile).count() == 0:

    but since you don't actually care about the number of types of armor in the category which the profile has access to, just whether there's any, you should use the exists() method instead.

  2. Having gone to the trouble to set up a ManyToManyField on the Armor model, why not use it? That is, instead of querying the Armor model like this:

    Q(profilearmor__profile = ...)

    you can run the same query like this and save some typing:

    Q(profile = ...)

7. Postscript

This was an excellent question. All too often, Django questions don't provide enough detail of the models, for anyone to be able to answer with confidence.