Using LIMIT within GROUP BY to get N results per group?
You could use GROUP_CONCAT aggregated function to get all years into a single column, grouped by id
and ordered by rate
:
SELECT id, GROUP_CONCAT(year ORDER BY rate DESC) grouped_yearFROM yourtableGROUP BY id
Result:
-----------------------------------------------------------| ID | GROUPED_YEAR |-----------------------------------------------------------| p01 | 2006,2003,2008,2001,2007,2009,2002,2004,2005,2000 || p02 | 2001,2004,2002,2003,2000,2006,2007 |-----------------------------------------------------------
And then you could use FIND_IN_SET, that returns the position of the first argument inside the second one, eg.
SELECT FIND_IN_SET('2006', '2006,2003,2008,2001,2007,2009,2002,2004,2005,2000');1SELECT FIND_IN_SET('2009', '2006,2003,2008,2001,2007,2009,2002,2004,2005,2000');6
Using a combination of GROUP_CONCAT
and FIND_IN_SET
, and filtering by the position returned by find_in_set, you could then use this query that returns only the first 5 years for every id:
SELECT yourtable.*FROM yourtable INNER JOIN ( SELECT id, GROUP_CONCAT(year ORDER BY rate DESC) grouped_year FROM yourtable GROUP BY id) group_max ON yourtable.id = group_max.id AND FIND_IN_SET(year, grouped_year) BETWEEN 1 AND 5ORDER BY yourtable.id, yourtable.year DESC;
Please see fiddle here.
Please note that if more than one row can have the same rate, you should consider using GROUP_CONCAT(DISTINCT rate ORDER BY rate) on the rate column instead of the year column.
The maximum length of the string returned by GROUP_CONCAT is limited, so this works well if you need to select a few records for every group.
The original query used user variables and ORDER BY
on derived tables; the behavior of both quirks is not guaranteed. Revised answer as follows.
In MySQL 5.x you can use poor man's rank over partition to achieve desired result. Just outer join the table with itself and for each row, count the number of rows lesser than it. In the above case, lesser row is the one with higher rate:
SELECT t.id, t.rate, t.year, COUNT(l.rate) AS rankFROM tLEFT JOIN t AS l ON t.id = l.id AND t.rate < l.rateGROUP BY t.id, t.rate, t.yearHAVING COUNT(l.rate) < 5ORDER BY t.id, t.rate DESC, t.year
| id | rate | year | rank ||-----|------|------|------|| p01 | 8.0 | 2006 | 0 || p01 | 7.4 | 2003 | 1 || p01 | 6.8 | 2008 | 2 || p01 | 5.9 | 2001 | 3 || p01 | 5.3 | 2007 | 4 || p02 | 12.5 | 2001 | 0 || p02 | 12.4 | 2004 | 1 || p02 | 12.2 | 2002 | 2 || p02 | 10.3 | 2003 | 3 || p02 | 8.7 | 2000 | 4 |
Note that if the rates had ties, for example:
100, 90, 90, 80, 80, 80, 70, 60, 50, 40, ...
The above query will return 6 rows:
100, 90, 90, 80, 80, 80
Change to HAVING COUNT(DISTINCT l.rate) < 5
to get 8 rows:
100, 90, 90, 80, 80, 80, 70, 60
Or change to ON t.id = l.id AND (t.rate < l.rate OR (t.rate = l.rate AND t.pri_key > l.pri_key))
to get 5 rows:
100, 90, 90, 80, 80
In MySQL 8 or later just use the RANK
, DENSE_RANK
or ROW_NUMBER
functions:
SELECT *FROM ( SELECT *, RANK() OVER (PARTITION BY id ORDER BY rate DESC) AS rnk FROM t) AS xWHERE rnk <= 5
For me something like
SUBSTRING_INDEX(group_concat(col_name order by desired_col_order_name), ',', N)
works perfectly. No complicated query.
for example: get top 1 for each group
SELECT *FROM yourtableWHERE id IN (SELECT SUBSTRING_INDEX(GROUP_CONCAT(id ORDER BY rate DESC), ',', 1) id FROM yourtable GROUP BY year)ORDER BY rate DESC;