Compute percentile rank relative to a given population
Setup:
In [62]: v = np.random.rand(100)
In [63]: x = np.array([0.3, 0.4, 0.7])
Using NumPy broadcasting:
In [64]: (v < x[:, None]).mean(axis=1)
Out[64]: array([ 0.18, 0.28, 0.6 ])
Check:
In [67]: percentile_rank(0.3)
Out[67]: 0.17999999999999999
In [68]: percentile_rank(0.4)
Out[68]: 0.28000000000000003
In [69]: percentile_rank(0.7)
Out[69]: 0.59999999999999998
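For anyone who wants to reproduce the check end to end, here is a minimal self-contained sketch. The `percentile_rank` below is only a stand-in I am assuming for the function from the question (fraction of the population strictly below the value); the seed and the `np.searchsorted` variant are my additions, not part of the original session.

```python
import numpy as np

# Seeded generator so the run is repeatable (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
v = rng.random(100)              # the population
x = np.array([0.3, 0.4, 0.7])    # values to rank against the population


def percentile_rank(value, population=v):
    """Assumed stand-in for the question's function:
    fraction of the population strictly below `value`."""
    return (population < value).mean()


# Broadcasting: x[:, None] has shape (3, 1) and v has shape (100,),
# so the comparison yields a (3, 100) boolean array; the row means
# are the percentile ranks.
broadcast = (v < x[:, None]).mean(axis=1)

# Scalar reference, one call per value.
scalar = np.array([percentile_rank(xi) for xi in x])

# Alternative that avoids the (len(x), len(v)) intermediate array:
# on a sorted population, side="left" counts elements strictly below x.
sorted_rank = np.searchsorted(np.sort(v), x, side="left") / len(v)

print(broadcast, scalar, sorted_rank)
```

The broadcasting version allocates a `len(x)` by `len(v)` boolean array, so for large inputs the sort plus `searchsorted` route is probably the lighter option on memory.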
I think pd.cut can do that:
s = pd.Series([-np.inf, 0.3, 0.5, 0.7])
pd.cut(v, s, right=False).value_counts().cumsum() / len(v)
Out[702]:
[-inf, 0.3)    0.37
[0.3, 0.5)     0.54
[0.5, 0.7)     0.71
dtype: float64
Result from your function
np.vectorize(percentile_rank)(np.array([0.3, 0.5, 0.7]))
Out[696]: array([0.37, 0.54, 0.71])
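A self-contained version of the same idea, in case it helps to see why the cumulative sum works. This is a sketch under my own assumptions: the seeded population and the direct `(v < e).mean()` comparison stand in for the question's data and function, which are not shown here.

```python
import numpy as np
import pandas as pd

# Seeded population so the run is repeatable (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
v = rng.random(100)

# Left-closed bins [-inf, 0.3), [0.3, 0.5), [0.5, 0.7).
# Values >= 0.7 fall outside every bin and become NaN, which
# value_counts() drops by default.
edges = [-np.inf, 0.3, 0.5, 0.7]
binned = pd.cut(v, edges, right=False)

# value_counts() on the categorical keeps the bins in interval order,
# so the running total at each bin is the number of values strictly
# below that bin's right edge.
cum_share = binned.value_counts().cumsum() / len(v)

# Direct computation for comparison: fraction strictly below each edge.
expected = np.array([(v < e).mean() for e in edges[1:]])

print(cum_share)
print(expected)
```

Note that the thresholds must be sorted for `pd.cut`, and the result is indexed by interval rather than by the threshold itself, so you read each rank off the right edge of its bin.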