What is the difference between MATLAB/Octave corr and Python numpy.correlate?
It appears that there exists a numpy.corrcoef which computes the correlation coefficients, as desired. However, its interface is different from the Octave/MATLAB corr.
First of all, by default, the function treats rows as variables, with the columns being observations. To mimic the Octave/MATLAB behavior, pass rowvar=False (or rowvar=0 in older code), which reverses this.
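To illustrate the difference the flag makes, here is a small sketch (the data is made up; any matrix with columns as variables works the same way):

```python
import numpy as np

# Two variables stored as columns, five observations each (Octave-style layout).
# The second column is exactly twice the first, so they are perfectly correlated.
x = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0],
              [4.0, 8.0],
              [5.0, 10.0]])

# Default (rowvar=True): rows are variables -> a 5x5 matrix, not what we want here.
print(np.corrcoef(x).shape)          # (5, 5)

# rowvar=False: columns are variables, matching Octave/MATLAB corr(x).
print(np.corrcoef(x, rowvar=False))  # 2x2 matrix of ones
```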
Also, according to this answer, the numpy.cov function (which corrcoef uses internally, I assume) returns a 2x2 matrix whose entries are the pairwise covariances:

cov(a,a)  cov(a,b)
cov(a,b)  cov(b,b)
As he points out, the [0][1] element is what you'd want for cov(a,b). Thus, perhaps something like this will work:
for i in range(25):
    c2[i] = numpy.corrcoef(a[:, i], b, rowvar=0)[0][1]
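A self-contained version of that loop is sketched below. The shapes (100 observations, 25 column variables) and the random data are assumptions just for demonstration; note that with 1-D inputs the rowvar flag is not actually needed:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((100, 25))  # 25 column variables, 100 observations (assumed shapes)
b = rng.standard_normal(100)        # a single reference variable

# Correlate each column of a with b, keeping the off-diagonal [0, 1] entry
# of the 2x2 correlation matrix that corrcoef returns for a pair of vectors.
c2 = np.empty(25)
for i in range(25):
    c2[i] = np.corrcoef(a[:, i], b)[0, 1]

print(c2.shape)  # (25,)
```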
For reference, here are some excerpts from the documentation of the two functions that you had tried. It seems that they do completely different things.
Octave:
— Function File: corr (x, y)
Compute matrix of correlation coefficients.
If each row of x and y is an observation and each column is a variable, then the (i, j)-th entry of corr (x, y) is the correlation between the i-th variable in x and the j-th variable in y.
corr (x,y) = cov (x,y) / (std (x) * std (y))
If called with one argument, compute corr (x, x), the correlation between the columns of x.
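The Octave formula above can be sketched directly in NumPy. The helper name octave_corr and the test matrices are my own assumptions, not part of either library:

```python
import numpy as np

def octave_corr(x, y):
    """Sketch of Octave's corr(x, y) = cov(x, y) / (std(x) * std(y)),
    with columns as variables and rows as observations."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cross_cov = x.T @ y / (n - 1)  # p x q matrix of pairwise covariances
    return cross_cov / np.outer(x.std(axis=0, ddof=1), y.std(axis=0, ddof=1))

# Every column below is a linear function of the row index, so all
# column pairs are perfectly correlated.
x = np.arange(12.0).reshape(4, 3)
y = x[:, ::-1]
print(octave_corr(x, y))  # 3x3 matrix of ones
```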
And Numpy:
numpy.correlate(a, v, mode='valid', old_behavior=False)
Cross-correlation of two 1-dimensional sequences.
This function computes the correlation as generally defined in signal processing texts:
z[k] = sum_n a[n] * conj(v[n+k])
with a and v sequences being zero-padded where necessary and conj being the conjugate.