pandas: find percentile stats of a given column
You can use the pandas.DataFrame.quantile() function, as shown below.
import pandas as pd
import random

A = [random.randint(0, 100) for i in range(10)]
B = [random.randint(0, 100) for i in range(10)]
df = pd.DataFrame({'field_A': A, 'field_B': B})
df
#    field_A  field_B
# 0       90       72
# 1       63       84
# 2       11       74
# 3       61       66
# 4       78       80
# 5       67       75
# 6       89       47
# 7       12       22
# 8       43        5
# 9       30       64

df.field_A.mean()    # Same as df['field_A'].mean()
# 54.399999999999999
df.field_A.median()
# 62.0

# You can call `quantile(i)` to get the i'th quantile,
# where `i` should be a fractional number.
df.field_A.quantile(0.1)  # 10th percentile
# 11.9
df.field_A.quantile(0.5)  # same as median
# 62.0
df.field_A.quantile(0.9)  # 90th percentile
# 89.10000000000001
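The same call also works on the whole DataFrame and accepts a list of fractions. A minimal sketch, reusing the sample values printed above as fixed data (instead of random draws) so the results are reproducible; the names `quantiles` and `summary` are only illustrative:

```python
import pandas as pd

# Fixed copy of the sample values printed above, so results are reproducible.
df = pd.DataFrame({
    'field_A': [90, 63, 11, 61, 78, 67, 89, 12, 43, 30],
    'field_B': [72, 84, 74, 66, 80, 75, 47, 22, 5, 64],
})

# One row per requested quantile, one column per field.
quantiles = df.quantile([0.1, 0.5, 0.9])
print(quantiles)

# describe() reports the requested percentiles next to count, mean, std, min and max.
summary = df['field_A'].describe(percentiles=[0.1, 0.5, 0.9])
print(summary)
```

In `summary` the percentiles appear under labels such as `10%`, `50%` and `90%`, which is handy when you want everything in one table.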
Assume a Series s:
import numpy as np
import pandas as pd
s = pd.Series(np.arange(100))
Get quantiles for [.1, .2, .3, .4, .5, .6, .7, .8, .9]
s.quantile(np.linspace(.1, 1, 9, endpoint=False))
0.1     9.9
0.2    19.8
0.3    29.7
0.4    39.6
0.5    49.5
0.6    59.4
0.7    69.3
0.8    79.2
0.9    89.1
dtype: float64
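Or, with interpolation set to 'lower' so that each result is a value that actually occurs in the series:
s.quantile(np.linspace(.1, 1, 9, endpoint=False), interpolation='lower')
0.1     9
0.2    19
0.3    29
0.4    39
0.5    49
0.6    59
0.7    69
0.8    79
0.9    89
dtype: int32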
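To see what the interpolation argument changes, here is a small sketch comparing the options that Series.quantile accepts on the same series. The `methods` list and `results` dict are illustrative names, not part of the answer above:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100))

# The 10th percentile falls at position 9.9, between the values 9 and 10.
methods = ['linear', 'lower', 'higher', 'nearest', 'midpoint']
results = {m: s.quantile(0.1, interpolation=m) for m in methods}

for name, value in results.items():
    print(name, value)
```

'linear' (the default) interpolates between the two neighbours, while 'lower', 'higher' and 'nearest' each return one of the neighbours and 'midpoint' returns their average.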
I found that the following works:
my_df.dropna().quantile([0.0, .9])
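One thing to keep in mind, as far as I can tell: quantile() already skips missing values within each column, whereas dropna() removes every row that has a missing value in any column. A rough sketch of the difference, using a made-up frame (the column names `a` and `b` and the values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column 'b'.
my_df = pd.DataFrame({
    'a': [1.0, 2.0, 3.0, 4.0],
    'b': [10.0, np.nan, 30.0, 40.0],
})

# Without dropna(): each column is computed from its own non-missing values.
per_column = my_df.quantile(0.5)

# With dropna(): the whole second row is removed first, which also changes 'a'.
complete_rows = my_df.dropna().quantile(0.5)

print(per_column)
print(complete_rows)
```

So calling dropna() first is a reasonable choice if you only want complete rows, but it can shift the quantiles of columns that had no missing values themselves.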