
numpy and statsmodels give different values when calculating correlations. How to interpret this?


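statsmodels.tsa.stattools.ccf is based on np.correlate, but it takes a few extra steps to return the correlation in the statistical sense rather than the signal-processing sense (see cross-correlation on Wikipedia): it subtracts the mean from each series, optionally divides each lag by the number of overlapping terms, and scales the result by the product of the two standard deviations. You can see exactly what happens in the source code, which is very simple.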

For easier reference I copied the relevant lines below:

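```python
import numpy as np


def ccovf(x, y, unbiased=True, demean=True):
    n = len(x)
    if demean:
        xo = x - x.mean()
        yo = y - y.mean()
    else:
        xo = x
        yo = y
    if unbiased:
        # Number of overlapping terms at each lag: 1, 2, ..., n, ..., 2, 1
        xi = np.ones(n)
        d = np.correlate(xi, xi, 'full')
    else:
        d = n
    # Keep the second half of the full output, starting at lag 0
    return (np.correlate(xo, yo, 'full') / d)[n - 1:]


def ccf(x, y, unbiased=True):
    cvf = ccovf(x, y, unbiased=unbiased, demean=True)
    # Scale the cross-covariance by the product of the standard deviations
    return cvf / (np.std(x) * np.std(y))
```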
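To make the difference concrete, here is a minimal sketch that uses only NumPy. The helper name ccf_sketch and the random test data are my own assumptions for illustration; the helper repeats the steps of the copied code rather than calling statsmodels, since the installed statsmodels version may differ from the lines quoted here.

```python
import numpy as np


def ccf_sketch(x, y, unbiased=True):
    """Hypothetical helper: repeats the demean / divide / scale steps of the copied code."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xo = x - x.mean()
    yo = y - y.mean()
    if unbiased:
        # Number of overlapping terms at each lag: 1, 2, ..., n, ..., 2, 1
        d = np.correlate(np.ones(n), np.ones(n), "full")
    else:
        d = n
    cvf = (np.correlate(xo, yo, "full") / d)[n - 1:]
    return cvf / (np.std(x) * np.std(y))


# Made-up data, only for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

# Signal-processing sense: a raw sum of products at each lag
raw = np.correlate(x, y, "full")[len(x) - 1:]
# Statistical sense: demeaned and normalized
normalized = ccf_sketch(x, y)

lag0_raw = raw[0]                   # same as np.dot(x, y)
lag0_ccf = normalized[0]            # same as the Pearson correlation
pearson = np.corrcoef(x, y)[0, 1]

print(lag0_raw, lag0_ccf, pearson)
```

At lag 0 the raw np.correlate value is just the dot product of the two series, so it grows with the length and scale of the data, while the normalized value matches np.corrcoef. That is why the two libraries report different numbers for the same input.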