
Why does corrcoef return a matrix?


It allows you to compute correlation coefficients of >2 data sets, e.g.

>>> from numpy import *
>>> a = array([1,2,3,4,6,7,8,9])
>>> b = array([2,4,6,8,10,12,13,15])
>>> c = array([-1,-2,-2,-3,-4,-6,-7,-8])
>>> corrcoef([a,b,c])
array([[ 1.        ,  0.99535001, -0.9805214 ],
       [ 0.99535001,  1.        , -0.97172394],
       [-0.9805214 , -0.97172394,  1.        ]])

Here we get the correlation coefficients of a,b (0.995), a,c (-0.981) and b,c (-0.972) at once. The two-data-set case is just a special case of the N-data-set case, so it makes sense to keep the same return type for both. Since the "one value" can be obtained simply with

>>> corrcoef(a,b)[1,0]
0.99535001355530017

there's no big reason to create the special case.
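For reference, here is the same idea as a small script with an explicit `import numpy as np` instead of the star import. It is only a sketch; the variable names `r`, `r_ab`, `r_ac`, `r_bc` and `r_ab_pair` are mine, not part of numpy.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 6, 7, 8, 9])
b = np.array([2, 4, 6, 8, 10, 12, 13, 15])
c = np.array([-1, -2, -2, -3, -4, -6, -7, -8])

# One row and one column per data set, so three inputs give a 3 x 3 matrix.
r = np.corrcoef([a, b, c])

# Entry [i, j] is the correlation between data set i and data set j.
r_ab = r[0, 1]
r_ac = r[0, 2]
r_bc = r[1, 2]

# With two inputs the result is 2 x 2; either off-diagonal entry
# ([0, 1] or [1, 0]) is the single coefficient.
r_ab_pair = np.corrcoef(a, b)[0, 1]

print(r.shape, round(r_ab, 3), round(r_ac, 3), round(r_bc, 3))
```

Because the matrix is symmetric, `[0, 1]` and `[1, 0]` give the same number.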


corrcoef returns the normalised covariance matrix.

The covariance matrix is the matrix

Cov( X, X )    Cov( X, Y )
Cov( Y, X )    Cov( Y, Y )

Normalised, this will yield the matrix:

Corr( X, X )    Corr( X, Y )
Corr( Y, X )    Corr( Y, Y )

correlation1[0, 0] is the correlation between Strategy1Returns and itself, which must be 1. You just want the off-diagonal entry, correlation1[0, 1].
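To make "normalised" concrete: each covariance is divided by the product of the two standard deviations, i.e. Corr(X, Y) = Cov(X, Y) / sqrt(Cov(X, X) * Cov(Y, Y)). The sketch below checks this against `np.corrcoef` on made-up sample data; `x`, `y`, `corr_manual` and `corr_numpy` are illustrative names, not the variables from the question.

```python
import numpy as np

# Made-up sample data, standing in for two return series.
x = np.array([1, 2, 3, 4, 6, 7, 8, 9], dtype=float)
y = np.array([2, 4, 6, 8, 10, 12, 13, 15], dtype=float)

cov = np.cov(x, y)               # 2 x 2 covariance matrix
std = np.sqrt(np.diag(cov))      # standard deviations of x and y

# Divide entry [i, j] by std[i] * std[j] to normalise.
corr_manual = cov / np.outer(std, std)
corr_numpy = np.corrcoef(x, y)

print(np.allclose(corr_manual, corr_numpy))
print(corr_manual[0, 1])         # the single coefficient of interest
```

The diagonal comes out as 1 because Cov(X, X) divided by the variance of X is 1 by construction.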


The correlation matrix is the standard way to express correlations between an arbitrary finite number of variables. The correlation matrix of N data vectors is a symmetric N × N matrix with unity diagonal, so it has N(N − 1)/2 independent entries. Only in the case N = 2 does this matrix have one free parameter.
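These properties are easy to check numerically. The following is a rough sketch on random data; the seed, the choice of N = 4 and all variable names are arbitrary assumptions made for the illustration.

```python
import numpy as np

n_vectors = 4                                  # N, chosen arbitrarily
rng = np.random.default_rng(0)
data = rng.normal(size=(n_vectors, 50))        # N random data vectors of length 50

corr = np.corrcoef(data)                       # N x N correlation matrix

is_symmetric = np.allclose(corr, corr.T)
has_unit_diagonal = np.allclose(np.diag(corr), 1.0)

# The entries strictly above the diagonal are the independent ones.
free_entries = corr[np.triu_indices(n_vectors, k=1)]

# For N = 2 only one such entry remains.
pair = np.corrcoef(data[0], data[1])
free_entries_pair = pair[np.triu_indices(2, k=1)]

print(is_symmetric, has_unit_diagonal, free_entries.size, free_entries_pair.size)
```

For N = 4 there are 4 * 3 / 2 = 6 entries above the diagonal, and for N = 2 there is exactly one.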