
MySQL: Average interval between records


Intuitively, what you are asking should be equivalent to the interval between the first and last dates, divided by the number of dates minus 1.

Let me explain more thoroughly. Imagine the dates are points on a line (+ are dates present, - are dates missing, the first date is the 12th, and I changed the last date to Dec 24th for illustration purposes):

++----+---+-+

Now, what you really want to do is space your dates out evenly between the first and last points, and find how far apart they end up:

+--+--+--+--+

To do that, you simply take the number of days between the last and first days, in this case 24 - 12 = 12, and divide it by the number of intervals you have to space out, in this case 4: 12 / 4 = 3.
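The reason this shortcut gives the same answer as averaging the individual gaps is that the gaps telescope. A short derivation, where the notation d_1 ≤ ... ≤ d_n for the sorted dates is my own and not from the question:

```latex
\frac{1}{n-1}\sum_{i=1}^{n-1}\left(d_{i+1}-d_i\right)
  = \frac{d_n - d_1}{n-1}
\qquad\text{e.g.}\qquad
\frac{(13-12)+(18-13)+(22-18)+(24-22)}{4}
  = \frac{1+5+4+2}{4}
  = \frac{12}{4}
  = 3
```

Every intermediate date appears once with a plus sign and once with a minus sign, so only the first and last dates survive.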

As a MySQL query:

SELECT DATEDIFF(MAX(dt), MIN(dt)) / (COUNT(dt) - 1) FROM a;

This works on the table below, where it returns 3 because the last date is changed to the 24th (with your original values it returns 2.75):

CREATE TABLE IF NOT EXISTS `a` (
  `dt` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `a` (`dt`) VALUES
  ('2010-12-12'),
  ('2010-12-13'),
  ('2010-12-18'),
  ('2010-12-22'),
  ('2010-12-24');


If the ids are uniformly incremented without gaps, join the table to itself on id+1:

-- n is the row that follows d, so subtract d.date from n.date to get a positive gap
SELECT d.id, d.date, n.date, DATEDIFF(n.date, d.date)
FROM dates d
JOIN dates n ON (n.id = d.id + 1)

Then GROUP BY and average as needed.

If the ids are not uniform, do an inner query to assign ordered ids first.
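One way to assign ordered ids is with a window function. This is only a sketch: it assumes MySQL 8.0 or later (for `ROW_NUMBER()` and common table expressions) and the same `dates` table with a `date` column as in the join above; the names `ordered`, `rn`, and `avg_interval_days` are made up for illustration.

```sql
-- Assumption: MySQL 8.0+ (window functions and CTEs are not available in older versions).
WITH ordered AS (
    -- Number the rows by date so that consecutive rows get consecutive rn values,
    -- regardless of gaps in the original id column.
    SELECT `date`, ROW_NUMBER() OVER (ORDER BY `date`) AS rn
    FROM dates
)
SELECT AVG(DATEDIFF(n.`date`, d.`date`)) AS avg_interval_days
FROM ordered d
JOIN ordered n ON n.rn = d.rn + 1;  -- pair each row with the next one in date order
```

Because the join already drops the last row (it has no successor), `AVG` divides by the number of intervals rather than the number of rows.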

I guess you'll also need to add a subquery to get the total number of rows.

Alternatively

Create an aggregate function that keeps track of the previous date, and a running sum and count. You'll still need to select from a subquery to force the ordering by date (actually, I'm not sure if that's guaranteed in MySQL).

Come to think of it, this is a much better way of doing it.
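On newer servers the "keep track of the previous date" idea does not need a custom aggregate. A sketch using the `LAG()` window function instead, which is a different technique from the one described above; it assumes MySQL 8.0 or later and the same `dates` table, and the names `prev_date`, `t`, and `avg_interval_days` are illustrative only.

```sql
-- Assumption: MySQL 8.0+ (LAG() is a window function).
SELECT AVG(DATEDIFF(`date`, prev_date)) AS avg_interval_days
FROM (
    -- For each row, fetch the date of the preceding row in date order.
    -- The earliest row has no predecessor, so its prev_date is NULL.
    SELECT `date`, LAG(`date`) OVER (ORDER BY `date`) AS prev_date
    FROM dates
) AS t;
```

`DATEDIFF` returns NULL when either argument is NULL, and `AVG` ignores NULLs, so the first row does not distort the result. The ordering is stated inside the `OVER` clause, so it does not rely on the order of the subquery.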

And Even Simpler

Just noting that Vegard's solution is much better.


The following query returns the correct result:

SELECT AVG(
         DATEDIFF(i.date, (SELECT MAX(date)
                           FROM intervals
                           WHERE date < i.date))
       )
FROM intervals i

but it runs a dependent subquery, which can be very inefficient when the date column has no index or the table has a large number of rows.
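If the subquery turns out to be slow, an index on the date column is the usual first thing to try. A sketch, assuming the `intervals` table from the query above; the index name `idx_intervals_date` is made up for illustration.

```sql
-- Hypothetical index name; the table and column come from the query above.
CREATE INDEX idx_intervals_date ON intervals (`date`);

-- Check how the server plans the query after adding the index.
EXPLAIN
SELECT AVG(DATEDIFF(i.`date`, (SELECT MAX(`date`)
                               FROM intervals
                               WHERE `date` < i.`date`)))
FROM intervals i;
```

Whether the optimizer actually uses the index depends on the server version and the data, so it is worth reading the `EXPLAIN` output rather than assuming it helped.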