
Standard deviation of a list


Since Python 3.4 (PEP 450), the standard library includes a statistics module, whose stdev function calculates the sample standard deviation of iterables like yours:

>>> A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
>>> import statistics
>>> statistics.stdev(A_rank)
2.0634114147853952
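One thing worth knowing: stdev divides by n - 1, while the same module also offers pstdev, which divides by n and is the one to use if the list is the whole population. A minimal sketch comparing the two on the list above (the variable names sample_sd and population_sd are just for illustration):

```python
import statistics

A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]

# Sample standard deviation: sum of squared deviations divided by n - 1.
sample_sd = statistics.stdev(A_rank)

# Population standard deviation: sum of squared deviations divided by n.
population_sd = statistics.pstdev(A_rank)

print(sample_sd)
print(population_sd)
```

The population value is always the smaller of the two, and the gap shrinks as the list gets longer.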


I would put A_rank and the other lists into a 2D NumPy array, and then use numpy.mean() and numpy.std() to compute the means and the standard deviations:

In [17]: import numpy
In [18]: arr = numpy.array([A_rank, B_rank, C_rank])
In [20]: numpy.mean(arr, axis=0)
Out[20]:
array([ 0.7       ,  2.2       ,  1.8       ,  2.13333333,  3.36666667,
        5.1       ])
In [21]: numpy.std(arr, axis=0)
Out[21]:
array([ 0.45460606,  1.29614814,  1.37355985,  1.50628314,  1.15566239,
        1.2083046 ])
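With axis=0, each result is computed down a column, that is, across the lists position by position. numpy.std() also defaults to ddof=0, so it returns the population standard deviation, unlike statistics.stdev. Here is a rough standard-library sketch of the same column-wise calculation, in case NumPy is not available. The B_rank and C_rank values are an assumption: they are not given in the text above, and were picked only because they are consistent with the output shown.

```python
import statistics

A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
# Assumed values, not given in the text above; chosen to be consistent
# with the means and standard deviations in the NumPy output.
B_rank = [0.1, 2.8, 3.7, 2.6, 5, 3.4]
C_rank = [1.2, 3.4, 0.5, 0.1, 2.5, 6.1]

# zip pairs up the i-th element of every list, like reading down a column.
columns = list(zip(A_rank, B_rank, C_rank))

means = [statistics.mean(col) for col in columns]
# pstdev divides by n, matching numpy.std's default ddof=0.
stds = [statistics.pstdev(col) for col in columns]

print(means)
print(stds)
```

Swapping pstdev for stdev here corresponds to passing ddof=1 to numpy.std().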


Here's some pure-Python code you can use to calculate the mean and standard deviation.

All code below is based on the statistics module in Python 3.4+.

def mean(data):
    """Return the sample arithmetic mean of data."""
    n = len(data)
    if n < 1:
        raise ValueError('mean requires at least one data point')
    return sum(data)/n # in Python 2 use sum(data)/float(n)

def _ss(data):
    """Return sum of square deviations of sequence data."""
    c = mean(data)
    ss = sum((x-c)**2 for x in data)
    return ss

def stddev(data, ddof=0):
    """Calculates the population standard deviation
    by default; specify ddof=1 to compute the sample
    standard deviation."""
    n = len(data)
    if n < 2:
        raise ValueError('variance requires at least two data points')
    ss = _ss(data)
    pvar = ss/(n-ddof)
    return pvar**0.5

Note: for improved accuracy when summing floats, the statistics module uses a custom function _sum rather than the built-in sum, which I've used here in its place.
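To see why the summation method matters, here is a small sketch of the rounding error that plain float addition accumulates. It uses math.fsum as a stand-in: that is a public standard-library function with a similar goal, not the statistics module's private _sum. The explicit loop mimics naive left-to-right addition; as far as I know, the built-in sum itself became more accurate for floats in Python 3.12, so calling sum directly may hide the effect on newer versions.

```python
import math

values = [0.1] * 10

# Naive left-to-right addition: each step rounds, and the errors add up.
naive_total = 0.0
for v in values:
    naive_total += v

# math.fsum keeps track of the lost low-order bits and rounds only once.
careful_total = math.fsum(values)

print(naive_total)    # 0.9999999999999999
print(careful_total)  # 1.0
```

For short lists of well-behaved numbers the difference is negligible, which is why the built-in sum is usually good enough for a quick calculation.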

Now we have for example:

>>> mean([1, 2, 3])
2.0
>>> stddev([1, 2, 3]) # population standard deviation
0.816496580927726
>>> stddev([1, 2, 3], ddof=1) # sample standard deviation
1.0