Multiple DISTINCT ON clauses in PostgreSQL Multiple DISTINCT ON clauses in PostgreSQL postgresql postgresql

Multiple DISTINCT ON clauses in PostgreSQL


As commented, there is ambiguity in the question. The number of result rows can differ for every call. If you are satisfied with arbitrary results, @klin's solution is good enough. Else, you need to define requiremens more closely. Like:

  • distinct on (name, birth), pick smallest height first, then smallest ID as tiebreaker
  • distinct on (name, height), pick earliest birth first, then smallest ID as tiebreaker

Your table should have a primary key (or some way to identify rows uniquely):

CREATE TEMP TABLE tbl (  tbl_id serial PRIMARY KEY, name text, birth int, height numeric);INSERT INTO tbl (name, birth, height)VALUES  ('William', 1976, 1.82), ('James',   1981, 1.68), ('Mike',    1976, 1.68), ('Tom',     1967, 1.79), ('William', 1976, 1.74), ('William', 1981, 1.82), ('Tom',     1978, 1.92), ('Mike',    1963, 1.68), ('Tom',     1971, 1.86), ('James',   1981, 1.77), ('Tom',     1971, 1.89);

Query:

SELECT DISTINCT ON (name, height) *FROM  (   SELECT DISTINCT ON (name, birth) *   FROM   tbl   ORDER  BY name, birth, height, tbl_id  -- pick smallest height, ID as tiebreaker   ) subORDER  BY name, height, birth, tbl_id;    -- pick earliest birth, ID as tiebreaker
 tbl_id |  name   | birth | height--------+---------+-------+--------      2 | James   |  1981 |   1.68      8 | Mike    |  1963 |   1.68      4 | Tom     |  1967 |   1.79      9 | Tom     |  1971 |   1.86      7 | Tom     |  1978 |   1.92      5 | William |  1976 |   1.74      6 | William |  1981 |   1.82(7 rows)    -- !!!

A DISTINCT ON query without deterministic ORDER BY can return any arbitrary row from each set of dupes. Applied once, you still get a deterministic number of rows (with arbitrary picks). Applied repeatedly, the resulting number of rows is arbitrary, too. Related:


Use a derived table:

with my_table(name, birth, height) as (values('William',    1976,      1.82),('James',      1981,      1.68),('Mike',       1976,      1.68),('Tom',        1967,      1.79),('William',    1976,      1.74),  -- ? (name, birth)('William',    1981,      1.82),  -- ? (name, height)('Tom',        1978,      1.92),('Mike',       1963,      1.68),  -- ? (name, height)('Tom',        1971,      1.86),('James',      1981,      1.77),  -- ? (name, birth)('Tom',        1971,      1.89)   -- ? (name, birth))select distinct on (name, height) *from (    select distinct on (name, birth) *    from my_table    ) s  name   | birth | height ---------+-------+-------- James   |  1981 |   1.68 Mike    |  1963 |   1.68 Tom     |  1967 |   1.79 Tom     |  1971 |   1.89 Tom     |  1978 |   1.92 William |  1976 |   1.82(6 rows)