PostgreSQL SQL GROUP BY time interval with arbitrary accuracy (down to milliseconds)


You can generate a table of "buckets" by adding intervals created by generate_series(). This SQL statement will generate a table of five-minute buckets for the first day (the value of min(measured_at)) in your data.

select
  (select min(measured_at)::date from measurements) + (n || ' minutes')::interval start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, (24*60), 5) n

Wrap that statement in a common table expression, and you can join and group on it as if it were a base table. The right join keeps empty buckets in the result, with a null average for any interval that has no measurements.

with five_min_intervals as (
  select
    (select min(measured_at)::date from measurements) + (n || ' minutes')::interval start_time,
    (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
  from generate_series(0, (24*60), 5) n
)
select f.start_time, f.end_time, avg(m.val) avg_val
from measurements m
right join five_min_intervals f
  on m.measured_at >= f.start_time and m.measured_at < f.end_time
group by f.start_time, f.end_time
order by f.start_time

Grouping by an arbitrary number of seconds is similar: use date_trunc() to normalize to whole seconds, or work with the epoch directly, as sketched below.
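For illustration, a minimal sketch of the epoch approach, assuming the same measurements(measured_at, val) table and an arbitrary 15-second bucket width:

-- date_trunc('second', measured_at) would give whole-second buckets;
-- for an arbitrary width, floor the epoch instead (15 seconds here)
select
  to_timestamp(floor(extract(epoch from measured_at) / 15) * 15) as bucket_start,
  avg(val) as avg_val
from measurements
group by bucket_start
order by bucket_start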


A more general use of generate_series() lets you avoid guessing the upper limit for five-minute buckets. In practice, you'd probably build this as a view or a function. You might get better performance from a base table.

select
  (select min(measured_at)::date from measurements) + (n || ' minutes')::interval start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, ((select max(measured_at)::date - min(measured_at)::date from measurements) + 1)*24*60, 5) n;
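As a sketch of the view approach mentioned above (the name five_min_buckets is illustrative, not from the original):

create view five_min_buckets as
select
  (select min(measured_at)::date from measurements) + (n || ' minutes')::interval as start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval as end_time
from generate_series(0, ((select max(measured_at)::date - min(measured_at)::date from measurements) + 1)*24*60, 5) n;

Queries can then join against five_min_buckets exactly as against the CTE above.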


Catcall has a great answer. My example of using it demonstrates fixed buckets, in this case 30-minute intervals starting at midnight. It also shows that Catcall's first version can generate one extra bucket, and how to eliminate it: because generate_series() includes its upper bound, stopping at 23*60+30 yields exactly the 48 buckets I wanted in a day, while 24*60 would produce a 49th starting at the following midnight. In my problem, observations have separate date and time columns, and I want to average the observations within each 30-minute period across the month for a number of different services.

with intervals as (
  select
    (n || ' minutes')::interval as start_time,
    ((n+30) || ' minutes')::interval as end_time
  from generate_series(0, (23*60+30), 30) n
)
select i.start_time, o.service, avg(o.o)
from observations o
right join intervals i
  on o.time >= i.start_time and o.time < i.end_time
where o.date between '2013-01-01' and '2013-01-31'
group by i.start_time, i.end_time, o.service
order by i.start_time
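A quick, illustrative sanity check that the series yields exactly 48 buckets (since generate_series() includes its upper bound):

select count(*) from generate_series(0, 23*60+30, 30);  -- returns 48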


How about

SELECT MIN(val),
       floor(EXTRACT(epoch FROM measured_at) / EXTRACT(epoch FROM INTERVAL '5 min')) AS int
FROM measurements
GROUP BY int

where '5 min' can be any expression supported by INTERVAL. The floor() is needed because the epoch quotient is fractional; without it, every distinct timestamp would land in its own group.
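A hedged variant of the same idea that labels each group with a readable bucket start (the to_timestamp() conversion is added here for illustration):

SELECT MIN(val),
       to_timestamp(floor(EXTRACT(epoch FROM measured_at) / EXTRACT(epoch FROM INTERVAL '5 min'))
                    * EXTRACT(epoch FROM INTERVAL '5 min')) AS bucket_start
FROM measurements
GROUP BY bucket_start
ORDER BY bucket_start;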