How do I limit the number of rows per field value in SQL?
Unluckily mysql doesn't have Analytical Functions. So you have to play with variables.Supposing you have an autoincrement field:
mysql> create table mytab ( -> id int not null auto_increment primary key, -> first_column int, -> second_column int -> ) engine = myisam;Query OK, 0 rows affected (0.05 sec)mysql> insert into mytab (first_column,second_column) -> values -> (1,1),(1,4),(2,10),(3,4),(1,4),(2,5),(1,6);Query OK, 7 rows affected (0.00 sec)Records: 7 Duplicates: 0 Warnings: 0mysql> select * from mytab order by id;+----+--------------+---------------+| id | first_column | second_column |+----+--------------+---------------+| 1 | 1 | 1 || 2 | 1 | 4 || 3 | 2 | 10 || 4 | 3 | 4 || 5 | 1 | 4 || 6 | 2 | 5 || 7 | 1 | 6 |+----+--------------+---------------+7 rows in set (0.00 sec)mysql> select -> id, -> first_column, -> second_column, -> row_num -> from ( -> select *, -> @num := if(@first_column = first_column, @num:= @num + 1, 1) as row_num, -> @first_column:=first_column as c -> from mytab order by first_column,id) as t,(select @first_column:='',@num:=0) as r;+----+--------------+---------------+---------+| id | first_column | second_column | row_num |+----+--------------+---------------+---------+| 1 | 1 | 1 | 1 || 2 | 1 | 4 | 2 || 5 | 1 | 4 | 3 || 7 | 1 | 6 | 4 || 3 | 2 | 10 | 1 || 6 | 2 | 5 | 2 || 4 | 3 | 4 | 1 |+----+--------------+---------------+---------+7 rows in set (0.00 sec)mysql> select -> id, -> first_column, -> second_column, -> row_num -> from ( -> select *, -> @num := if(@first_column = first_column, @num:= @num + 1, 1) as row_num, -> @first_column:=first_column as c -> from mytab order by first_column,id) as t,(select @first_column:='',@num:=0) as r -> having row_num<=2;+----+--------------+---------------+---------+| id | first_column | second_column | row_num |+----+--------------+---------------+---------+| 1 | 1 | 1 | 1 || 2 | 1 | 4 | 2 || 3 | 2 | 10 | 1 || 6 | 2 | 5 | 2 || 4 | 3 | 4 | 1 |+----+--------------+---------------+---------+5 rows in set (0.02 sec)
A Hive solution would be
SELECT S.col1, S.col2FROM (SELECT col1, col2, row_number() over (partition by col1) as r FROM mytable) SWHERE S.r < 3