must appear in the GROUP BY clause or be used in an aggregate function


Yes, this is a common aggregation problem. Before SQL3 (1999), every selected field had to appear in the GROUP BY clause or be used in an aggregate function[*].
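For reference, a minimal sketch of a setup that triggers the error. The table and column names come from the queries in this thread; the column types and the avg value for luffy are assumptions (any value below 5 is consistent with the outputs shown here).

```sql
-- Hypothetical setup: column types are assumed, rows mirror the outputs in this thread.
CREATE TABLE makerar (
    cname  text,
    wmname text,
    avg    numeric
);

INSERT INTO makerar (cname, wmname, avg) VALUES
    ('canada', 'zoro',  2.0),
    ('spain',  'luffy', 1.0),   -- assumed value, only known to be below 5
    ('spain',  'usopp', 5.0);

-- Fails in PostgreSQL: wmname is neither grouped nor aggregated.
SELECT cname, wmname, MAX(avg)
FROM makerar
GROUP BY cname;
-- ERROR:  column "makerar.wmname" must appear in the GROUP BY clause or be used in an aggregate function
```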

To work around this, calculate the aggregate in a subquery and then join it back to the table to get the additional columns you need to show:

    SELECT m.cname, m.wmname, t.mx
    FROM (
        SELECT cname, MAX(avg) AS mx
        FROM makerar
        GROUP BY cname
    ) t
    JOIN makerar m ON m.cname = t.cname AND t.mx = m.avg;

     cname  | wmname |          mx
    --------+--------+------------------------
     canada | zoro   |     2.0000000000000000
     spain  | usopp  |     5.0000000000000000

But you can also use window functions, which look simpler:

    SELECT cname, wmname, MAX(avg) OVER (PARTITION BY cname) AS mx
    FROM makerar;

The only catch with this method is that it shows all records (window functions do not group). It does show the correct MAX for the country (i.e. the maximum at cname level) in each row, so it's up to you:

     cname  | wmname |          mx
    --------+--------+------------------------
     canada | zoro   |     2.0000000000000000
     spain  | luffy  |     5.0000000000000000
     spain  | usopp  |     5.0000000000000000

The solution, arguably less elegant, to show only the (cname, wmname) tuples matching the max value is:

    SELECT DISTINCT /* DISTINCT matters here, because there may be several tuples for the same max value */
        m.cname, m.wmname, t.avg AS mx
    FROM (
        SELECT cname, wmname, avg,
               ROW_NUMBER() OVER (PARTITION BY cname ORDER BY avg DESC) AS rn
        FROM makerar
    ) t
    JOIN makerar m ON m.cname = t.cname AND m.wmname = t.wmname AND t.rn = 1;

     cname  | wmname |          mx
    --------+--------+------------------------
     canada | zoro   |     2.0000000000000000
     spain  | usopp  |     5.0000000000000000
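A possible variant, sketched here rather than taken from the answer itself: the window result can be filtered in an outer query, which avoids the join back to makerar. The subquery is needed because a window function cannot be referenced in the WHERE clause of the same query level. The alias w is arbitrary.

```sql
SELECT cname, wmname, mx
FROM (
    SELECT cname, wmname, avg,
           MAX(avg) OVER (PARTITION BY cname) AS mx
    FROM makerar
) w
WHERE avg = mx;   -- keeps every row tied for the maximum within its cname
```

Unlike a ROW_NUMBER() filter, this returns all rows that share the maximum, so add DISTINCT if makerar can contain exact duplicates.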

[*]: Interestingly enough, even though the spec sort of allows selecting non-grouped fields, major engines don't seem to like it. Oracle and SQL Server don't allow it at all. MySQL used to allow it by default, but since 5.7 the ONLY_FULL_GROUP_BY SQL mode is enabled by default, so the administrator has to disable that mode manually in the server configuration for this to be allowed.


In Postgres, you can also use the special DISTINCT ON (expression) syntax:

    SELECT DISTINCT ON (cname)
        cname, wmname, avg
    FROM
        makerar
    ORDER BY
        cname, avg DESC;
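DISTINCT ON keeps the first row of each cname group in ORDER BY order, so when two rows tie on avg, which one is returned is not predictable. A sketch that adds a tie-breaker; sorting ties by wmname is an assumption made purely for illustration:

```sql
SELECT DISTINCT ON (cname)
    cname, wmname, avg
FROM makerar
ORDER BY
    cname,        -- must lead the ORDER BY, matching the DISTINCT ON expression
    avg DESC,     -- highest avg first within each cname
    wmname;       -- assumed tie-breaker: alphabetical among equal avg values
```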


The problem with specifying non-grouped, non-aggregated fields in a GROUP BY select is that the engine has no way of knowing which record's field it should return. Is it the first? The last? There is usually no record that naturally corresponds to the aggregated result (min and max are exceptions).

However, there is a workaround: make the required field aggregated as well. In Postgres, this should work:

    SELECT cname, (array_agg(wmname ORDER BY avg DESC))[1], MAX(avg)
    FROM makerar
    GROUP BY cname;

Note that this creates an array of all wmname values in each group, ordered by avg descending, and returns the first element (arrays in Postgres are 1-based).
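To see the intermediate array before the [1] subscript is applied, the aggregate can be selected on its own. This is only a sketch; the column alias names_by_avg is made up for illustration, and the per-group results in the comments assume the makerar rows implied by the outputs in this thread.

```sql
-- Assumes rows (canada, zoro, 2), (spain, usopp, 5) and (spain, luffy, some avg below 5).
SELECT cname,
       array_agg(wmname ORDER BY avg DESC) AS names_by_avg
FROM makerar
GROUP BY cname;
-- canada -> {zoro}
-- spain  -> {usopp,luffy}
```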