PostgreSQL SQL GROUP BY time interval with arbitrary accuracy (down to milliseconds)


You can generate a table of "buckets" by adding intervals created by generate_series(). This SQL statement will generate a table of five-minute buckets for the first day (the value of min(measured_at)) in your data.

select
  (select min(measured_at)::date from measurements) + (n || ' minutes')::interval start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, (24*60), 5) n

Wrap that statement in a common table expression, and you can join and group on it as if it were a base table. The right join keeps empty buckets in the result, with a null average for any interval that has no measurements.

with five_min_intervals as (
  select
    (select min(measured_at)::date from measurements) + (n || ' minutes')::interval start_time,
    (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
  from generate_series(0, (24*60), 5) n
)
select f.start_time, f.end_time, avg(m.val) avg_val
from measurements m
right join five_min_intervals f
  on m.measured_at >= f.start_time and m.measured_at < f.end_time
group by f.start_time, f.end_time
order by f.start_time

Grouping by an arbitrary number of seconds is similar: use date_trunc() to normalize to whole seconds, or work with the epoch directly, as sketched below.
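For illustration, a minimal sketch of the epoch approach, assuming the same measurements(measured_at, val) table and an arbitrary 15-second bucket width:

-- date_trunc('second', measured_at) would give whole-second buckets;
-- for an arbitrary width, floor the epoch instead (15 seconds here)
select
  to_timestamp(floor(extract(epoch from measured_at) / 15) * 15) as bucket_start,
  avg(val) as avg_val
from measurements
group by bucket_start
order by bucket_start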


A more general use of generate_series() lets you avoid guessing the upper limit for five-minute buckets. In practice, you'd probably build this as a view or a function. You might get better performance from a base table.

select
  (select min(measured_at)::date from measurements) + (n || ' minutes')::interval start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, ((select max(measured_at)::date - min(measured_at)::date from measurements) + 1)*24*60, 5) n;
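As a sketch of the view approach mentioned above (the name five_min_buckets is illustrative, not from the original):

create view five_min_buckets as
select
  (select min(measured_at)::date from measurements) + (n || ' minutes')::interval as start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval as end_time
from generate_series(0, ((select max(measured_at)::date - min(measured_at)::date from measurements) + 1)*24*60, 5) n;

Queries can then join against five_min_buckets exactly as against the CTE above.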


Catcall has a great answer. My example of using it demonstrates fixed buckets, in this case 30-minute intervals starting at midnight. It also shows that Catcall's first version can generate one extra bucket, and how to eliminate it: because generate_series() includes its upper bound, stopping at 23*60+30 yields exactly the 48 buckets I wanted in a day, while 24*60 would produce a 49th starting at the following midnight. In my problem, observations have separate date and time columns, and I want to average the observations within each 30-minute period across the month for a number of different services.

with intervals as (
  select
    (n || ' minutes')::interval as start_time,
    ((n+30) || ' minutes')::interval as end_time
  from generate_series(0, (23*60+30), 30) n
)
select i.start_time, o.service, avg(o.o)
from observations o
right join intervals i
  on o.time >= i.start_time and o.time < i.end_time
where o.date between '2013-01-01' and '2013-01-31'
group by i.start_time, i.end_time, o.service
order by i.start_time
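A quick, illustrative sanity check that the series yields exactly 48 buckets (since generate_series() includes its upper bound):

select count(*) from generate_series(0, 23*60+30, 30);  -- returns 48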


How about

SELECT MIN(val),
       floor(EXTRACT(epoch FROM measured_at) / EXTRACT(epoch FROM INTERVAL '5 min')) AS int
FROM measurements
GROUP BY int

where '5 min' can be any expression supported by INTERVAL. The floor() is needed because the epoch quotient is fractional; without it, every distinct timestamp would land in its own group.
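A hedged variant of the same idea that labels each group with a readable bucket start (the to_timestamp() conversion is added here for illustration):

SELECT MIN(val),
       to_timestamp(floor(EXTRACT(epoch FROM measured_at) / EXTRACT(epoch FROM INTERVAL '5 min'))
                    * EXTRACT(epoch FROM INTERVAL '5 min')) AS bucket_start
FROM measurements
GROUP BY bucket_start
ORDER BY bucket_start;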