Estimate Autocorrelation using Python
I don't think there is a NumPy function for this particular calculation. Here is how I would write it:
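    import numpy as np

    def estimated_autocorrelation(x):
        """
        http://stackoverflow.com/q/14297012/190597
        http://en.wikipedia.org/wiki/Autocorrelation#Estimation
        """
        x = np.asarray(x, dtype=float)  # accept lists as well as arrays
        n = len(x)
        variance = x.var()
        x = x - x.mean()
        # mode='full' covers lags -(n-1)..(n-1); the last n entries are lags 0..n-1
        r = np.correlate(x, x, mode='full')[-n:]
        assert np.allclose(r, np.array([(x[:n-k]*x[-(n-k):]).sum() for k in range(n)]))
        # the lag-k sum has n-k terms, so divide by variance * (n-k)
        result = r / (variance * np.arange(n, 0, -1))
        return result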
The assert statement is there both to check the calculation and to document its intent.
When you are confident the function behaves as expected, you can comment out the assert statement or run your script with python -O. (The -O flag tells Python to ignore assert statements.)
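To see what the slice and the normalization do, here is a minimal worked example on a three-element array. It is only a sketch: the input values are arbitrary and chosen so the numbers are easy to check by hand.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])    # arbitrary illustrative input
n = len(x)
centered = x - x.mean()          # [-1, 0, 1]

# mode='full' returns lags -(n-1) .. (n-1); the last n entries are lags 0 .. n-1
full = np.correlate(centered, centered, mode='full')   # values: -1, 0, 2, 0, -1
r = full[-n:]                                          # values: 2, 0, -1

# Divide by the variance and by the number of terms (n - k) in each lag-k sum
counts = np.arange(n, 0, -1)     # [3, 2, 1]
result = r / (x.var() * counts)  # approximately 1.0, 0.0, -1.5

print(result)
```

The last value falls outside [-1, 1]. At lag n-1 the sum has a single term, so the estimate normalized by n - k becomes noisy at large lags.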
I adapted part of the code from the pandas autocorrelation_plot() function. I checked the results against R, and the values match exactly.
    import numpy

    def acf(series):
        n = len(series)
        data = numpy.asarray(series)
        mean = numpy.mean(data)
        c0 = numpy.sum((data - mean) ** 2) / float(n)

        def r(h):
            acf_lag = ((data[:n - h] - mean) * (data[h:] - mean)).sum() / float(n) / c0
            return round(acf_lag, 3)

        x = numpy.arange(n)  # lags 0 .. n-1 (lag 0 is included and always equals 1.0)
        acf_coeffs = [r(h) for h in x]  # a list; map() would be lazy in Python 3
        return acf_coeffs
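Note that estimated_autocorrelation and acf normalize differently: the first divides the lag-k sum by n - k, while acf divides by n. As a rough sketch (the helper names acf_unbiased and acf_biased are made up for illustration, the test signal is arbitrary, and the rounding is left out), the two estimates differ by exactly the factor (n - k) / n:

```python
import numpy as np

def acf_unbiased(x):
    """Hypothetical helper: lag-k sum divided by (n - k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    r = np.correlate(d, d, mode='full')[-n:]
    return r / (x.var() * np.arange(n, 0, -1))

def acf_biased(x):
    """Hypothetical helper: lag-k sum divided by n (no rounding)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    r = np.correlate(d, d, mode='full')[-n:]
    return r / (x.var() * n)

x = np.sin(np.linspace(0, 6 * np.pi, 50))   # arbitrary test signal
n = len(x)
k = np.arange(n)

# Both estimates equal 1 at lag 0 and drift apart as k grows
ratio_holds = np.allclose(acf_biased(x), acf_unbiased(x) * (n - k) / n)
print(ratio_holds)
```

Which one to use depends on the application. Dividing by n shrinks the estimate toward zero at large lags, while dividing by n - k does not, but is noisier there.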
The statsmodels package provides an autocorrelation function that internally uses np.correlate (according to the statsmodels documentation).
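For reference, a minimal usage sketch, assuming the function meant here is statsmodels.tsa.stattools.acf and that statsmodels is installed (the import is guarded so the NumPy part still runs without it). Passing fft=False requests the direct, non-FFT computation. The test signal and the choice of 10 lags are arbitrary.

```python
import numpy as np

try:
    from statsmodels.tsa.stattools import acf as sm_acf
except ImportError:          # statsmodels is optional for this sketch
    sm_acf = None

x = np.sin(np.linspace(0, 6 * np.pi, 50))   # arbitrary test signal
n = len(x)
nlags = 10

# NumPy reference: lag-k sum divided by n, for lags 0 .. nlags.
# In the 'full' output, index n-1 holds lag 0.
d = x - x.mean()
reference = np.correlate(d, d, mode='full')[n - 1:n + nlags] / (n * x.var())

if sm_acf is not None:
    sm_values = sm_acf(x, nlags=nlags, fft=False)   # lags 0 .. nlags
    print(np.allclose(sm_values, reference))
else:
    print("statsmodels not installed; skipping the comparison")
```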