Count distinct values with OVER(PARTITION BY id)


No. As the error message states, PostgreSQL does not implement DISTINCT for window functions. Applying the information from this link to your case, you could compute the distinct count in a grouped CTE and join it back:

    WITH uniques AS (
        SELECT congestion.id_element,
               COUNT(DISTINCT congestion.week_nb) AS unique_references
        FROM congestion
        WHERE congestion.date >= '2014.01.01'
          AND congestion.date <= '2014.12.31'
        GROUP BY congestion.id_element
    )
    SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
           congestion.id_element,
           ROW_NUMBER() OVER (
               PARTITION BY congestion.id_element
               ORDER BY congestion.date
           ),
           uniques.unique_references AS week_count
    FROM congestion
    JOIN uniques USING (id_element)
    WHERE congestion.date >= '2014.01.01'
      AND congestion.date <= '2014.12.31'
    ORDER BY id_element, date
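To try the grouped-CTE approach without touching a real table, here is a minimal self-contained sketch. The inline `congestion` CTE and its four rows are made-up sample data, and the alias `rn` is only an illustrative name; the date filter is left out to keep it short.

```sql
-- Made-up sample rows standing in for the real congestion table.
WITH congestion (date, week_nb, id_congestion, id_element) AS (
    VALUES
        (DATE '2014-01-06', 2, 1, 10),
        (DATE '2014-01-07', 2, 2, 10),
        (DATE '2014-01-14', 3, 3, 10),
        (DATE '2014-01-08', 2, 4, 20)
),
-- One row per element with its number of distinct weeks.
uniques AS (
    SELECT id_element, COUNT(DISTINCT week_nb) AS week_count
    FROM congestion
    GROUP BY id_element
)
SELECT c.date, c.week_nb, c.id_element,
       ROW_NUMBER() OVER (PARTITION BY c.id_element ORDER BY c.date) AS rn,
       u.week_count
FROM congestion AS c
JOIN uniques AS u USING (id_element)
ORDER BY c.id_element, c.date;
-- Element 10 has weeks 2 and 3, so its three rows get week_count = 2;
-- element 20 has only week 2, so its single row gets week_count = 1.
```

The aggregate runs once per element, and the join only copies the result onto each detail row, which is why the window function itself never needs DISTINCT.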

Depending on the situation, you could also put a correlated subquery straight into the SELECT list:

    SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
           congestion.id_element,
           ROW_NUMBER() OVER (
               PARTITION BY congestion.id_element
               ORDER BY congestion.date
           ),
           (SELECT COUNT(DISTINCT dist_con.week_nb)
            FROM congestion AS dist_con
            WHERE dist_con.date >= '2014.01.01'
              AND dist_con.date <= '2014.12.31'
              AND dist_con.id_element = congestion.id_element) AS week_count
    FROM congestion
    WHERE congestion.date >= '2014.01.01'
      AND congestion.date <= '2014.12.31'
    ORDER BY id_element, date


I find that the easiest way is to use a subquery/CTE and conditional aggregation:

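    SELECT c.date, c.week_nb, c.id_congestion, c.id_element,
           ROW_NUMBER() OVER (PARTITION BY c.id_element ORDER BY c.date),
           -- seqnum = 1 marks the first row of each (id_element, week_nb) pair;
           -- summing those flags per element gives the distinct week count
           SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END)
               OVER (PARTITION BY c.id_element) AS week_count
    FROM (SELECT c.*,
                 ROW_NUMBER() OVER (PARTITION BY c.id_element, c.week_nb
                                    ORDER BY c.date) AS seqnum
          FROM congestion c
          -- filter inside the subquery so that the seqnum = 1 row
          -- of every week falls within the date range
          WHERE c.date >= '2014.01.01' AND c.date <= '2014.12.31'
         ) c
    ORDER BY id_element, date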
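The flag-and-sum idea can be checked on a few rows. This is a sketch under stated assumptions: the inline `congestion` CTE holds made-up sample data, and `flagged` and `first_in_week` are hypothetical names chosen for illustration.

```sql
-- Made-up sample rows standing in for the real congestion table.
WITH congestion (date, week_nb, id_congestion, id_element) AS (
    VALUES
        (DATE '2014-01-06', 2, 1, 10),
        (DATE '2014-01-07', 2, 2, 10),
        (DATE '2014-01-14', 3, 3, 10),
        (DATE '2014-01-08', 2, 4, 20)
),
-- Number the rows inside each (element, week) pair.
flagged AS (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY c.id_element, c.week_nb
                              ORDER BY c.date) AS seqnum
    FROM congestion AS c
)
SELECT f.date, f.week_nb, f.id_element,
       -- 1 on the first row of each week, 0 on the rest
       CASE WHEN f.seqnum = 1 THEN 1 ELSE 0 END AS first_in_week,
       -- summing the flags over the element counts each week exactly once
       SUM(CASE WHEN f.seqnum = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY f.id_element) AS week_count
FROM flagged AS f
ORDER BY f.id_element, f.date;
-- Element 10: flags 1, 0, 1 and week_count = 2 on all three rows.
-- Element 20: flag 1 and week_count = 1.
```

Because each (element, week) pair has exactly one row with `seqnum = 1`, a plain windowed SUM of the flag behaves like a distinct count without needing DISTINCT inside the window function.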


Make the partitioned set smaller, to the point where there are no duplicates in the counted field:

    SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
           congestion.id_element,
           ROW_NUMBER() OVER (
               PARTITION BY congestion.id_element
               ORDER BY congestion.date
           ),
           COUNT(congestion.week_nb) -- remove distinct
           OVER (
               PARTITION BY congestion.id_element,
                            -- add new fields which will restart counter in case duplication
                            congestion.id_congestion
           ) AS week_count
    FROM congestion
    WHERE congestion.date >= '2014.01.01'
      AND congestion.date <= '2014.12.31'
    ORDER BY id_element, date
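This approach depends on the finer partition really containing no repeated `week_nb` values, so it may be worth checking that first. The following is a minimal sketch of such a check; the inline `congestion` CTE is made-up sample data that deliberately includes one duplicate, and `n_rows` is an illustrative alias.

```sql
-- Made-up sample rows; the first two share element, congestion and week.
WITH congestion (date, week_nb, id_congestion, id_element) AS (
    VALUES
        (DATE '2014-01-06', 2, 1, 10),
        (DATE '2014-01-07', 2, 1, 10),
        (DATE '2014-01-14', 3, 2, 10),
        (DATE '2014-01-08', 2, 3, 20)
)
-- List every week that occurs more than once inside a partition.
SELECT id_element, id_congestion, week_nb, COUNT(*) AS n_rows
FROM congestion
GROUP BY id_element, id_congestion, week_nb
HAVING COUNT(*) > 1;
-- Returns one row: (10, 1, 2, 2).
```

If this query returns any rows, a plain `COUNT(week_nb)` over that partition would count the repeated weeks more than once, and one of the other approaches is likely the safer choice.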