Calculating Pearson correlation and significance in Python

You can have a look at scipy.stats:

from pydoc import helpfrom scipy.stats.stats import pearsonrhelp(pearsonr)>>>Help on function pearsonr in module scipy.stats.stats:pearsonr(x, y) Calculates a Pearson correlation coefficient and the p-value for testing non-correlation. The Pearson correlation coefficient measures the linear relationship between two datasets. Strictly speaking, Pearson's correlation requires that each dataset be normally distributed. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact linear relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases. The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets. The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so. Parameters ---------- x : 1D array y : 1D array the same length as x Returns ------- (Pearson's correlation coefficient,  2-tailed p-value) References ---------- http://www.statsoft.com/textbook/glosp.html#Pearson%20Correlation

python numpy statistics scipy

The Pearson correlation can be calculated with numpy's corrcoef.

import numpynumpy.corrcoef(list1, list2)[0, 1]

python numpy statistics scipy

An alternative can be a native scipy function from linregress which calculates:

slope : slope of the regression line
intercept : intercept of the regression line
r-value : correlation coefficient
p-value : two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero
stderr : Standard error of the estimate

And here is an example:

a = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]b = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]from scipy.stats import linregresslinregress(a, b)

will return you:

LinregressResult(slope=0.20833333333333337, intercept=13.375, rvalue=0.14499815458068521, pvalue=0.68940144811669501, stderr=0.50261704627083648)

CodeHunter

Calculating Pearson correlation and significance in Python

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last