How do I calculate percentiles with python/numpy?


You might be interested in the SciPy Stats package. It has the percentile function you're after (scipy.stats.scoreatpercentile) and many other statistical goodies.

percentile() is available in numpy too.

    import numpy as np

    a = np.array([1, 2, 3, 4, 5])
    p = np.percentile(a, 50)  # return the 50th percentile, i.e. the median.
    print(p)  # prints 3.0

An earlier ticket had led me to believe they wouldn't be integrating percentile() into numpy anytime soon, but as noted above it is available now.


By the way, there is a pure-Python implementation of the percentile function, in case one doesn't want to depend on scipy. The function is copied below:

    ## {{{ http://code.activestate.com/recipes/511478/ (r1)
    import math
    import functools

    def percentile(N, percent, key=lambda x:x):
        """
        Find the percentile of a list of values.

        @parameter N - is a list of values. Note N MUST BE already sorted.
        @parameter percent - a float value from 0.0 to 1.0.
        @parameter key - optional key function to compute value from each element of N.

        @return - the percentile of the values
        """
        if not N:
            return None
        k = (len(N)-1) * percent
        f = math.floor(k)
        c = math.ceil(k)
        if f == c:
            return key(N[int(k)])
        d0 = key(N[int(f)]) * (c-k)
        d1 = key(N[int(c)]) * (k-f)
        return d0+d1

    # median is 50th percentile.
    median = functools.partial(percentile, percent=0.5)
    ## end of http://code.activestate.com/recipes/511478/ }}}
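Two details of the recipe above are easy to trip over: the input has to be sorted before the call, and the percentile is given as a fraction between 0.0 and 1.0 rather than as a number between 0 and 100. Here is a minimal usage sketch, assuming a trimmed-down copy of the function. The parameter names `sorted_values` and `fraction` are my own, and the `key` argument is dropped for brevity.

```python
import math


def percentile(sorted_values, fraction):
    """Linear-interpolation percentile, following the recipe's logic.

    sorted_values must already be sorted; fraction is in [0.0, 1.0].
    (Names are illustrative, not part of the original recipe.)
    """
    if not sorted_values:
        return None
    k = (len(sorted_values) - 1) * fraction  # fractional index into the list
    f = int(math.floor(k))
    c = int(math.ceil(k))
    if f == c:
        # k landed exactly on an element, no interpolation needed
        return sorted_values[f]
    # weight the two neighbouring elements by their distance from k
    return sorted_values[f] * (c - k) + sorted_values[c] * (k - f)


data = [154, 400, 1124, 82, 94, 108]
ordered = sorted(data)             # the function does not sort for you
p95 = percentile(ordered, 0.95)    # 0.95, not 95
median = percentile(ordered, 0.5)
print(p95, median)
```

For this data the 95th percentile comes out at roughly 943 and the median at 131. Passing the unsorted list would not raise an error, it would just quietly give a wrong answer.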


    import numpy as np

    a = [154, 400, 1124, 82, 94, 108]
    print(np.percentile(a, 95))  # gives the 95th percentile
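If you need several percentiles at once, np.percentile also accepts a sequence for its second argument and returns one value per entry. A small sketch on the same data follows. The variable names are just for illustration, and it assumes NumPy's default linear interpolation between neighbouring values.

```python
import numpy as np

a = [154, 400, 1124, 82, 94, 108]

# One call, three percentiles (given on the 0-100 scale).
lower_quartile, median, p95 = np.percentile(a, [25, 50, 95])
print(lower_quartile, median, p95)

# The 50th percentile should agree with np.median.
print(np.median(a))
```

Note that np.percentile takes values from 0 to 100 and does not need the input to be sorted.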