Better binning in pandas [duplicate]


Perhaps you are looking for pandas.cut:

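import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(50), columns=['filtercol'])
filter_values = [0, 5, 17, 33]

# assign each value to one of the bins (0, 5], (5, 17], (17, 33]
out = pd.cut(df.filtercol, bins=filter_values)

# counts is a Series; Series.value_counts replaces the deprecated pd.value_counts
counts = out.value_counts()
print(counts)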

yields

(17, 33]    16
(5, 17]     12
(0, 5]       5
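Note that these counts add up to 33, not 50. By default pd.cut builds intervals that are open on the left and closed on the right, so 0 falls outside (0, 5] and everything above 33 falls outside every bin; those rows come back as NaN. Here is a small sketch, assuming the same data as above, that counts the unbinned rows and shows the include_lowest option (the variable names n_unbinned and n_unbinned_incl are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(50), columns=['filtercol'])
filter_values = [0, 5, 17, 33]

# default: right-closed bins, so 0 and 34..49 are not assigned to any bin
out = pd.cut(df.filtercol, bins=filter_values)
n_unbinned = int(out.isna().sum())

# include_lowest=True closes the first interval on the left, so 0 is binned too
out_incl = pd.cut(df.filtercol, bins=filter_values, include_lowest=True)
n_unbinned_incl = int(out_incl.isna().sum())

print(n_unbinned, n_unbinned_incl)
```

If the values above 33 should be counted as well, one option is to append a larger upper edge (for example np.inf) to filter_values.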

To reorder the result so the bin ranges appear in order, you could use

counts.sort_index()

which yields

(0, 5]       5
(5, 17]     12
(17, 33]    16
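As a possible alternative to sorting afterwards, you could skip the sort by frequency in the first place. The sketch below assumes the same data as above and relies on value_counts(sort=False) returning the bins of a categorical Series in category order; it compares that against the sort_index() approach (counts_sorted and counts_unsorted are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(50), columns=['filtercol'])
filter_values = [0, 5, 17, 33]
out = pd.cut(df.filtercol, bins=filter_values)

# approach from the answer: count, then order by bin
counts_sorted = out.value_counts().sort_index()

# alternative: do not sort by frequency, leaving the bins in category order
counts_unsorted = out.value_counts(sort=False)

print(counts_sorted.tolist())
print(counts_unsorted.tolist())
```

Both should list the bins from lowest to highest. sort_index() is the more explicit of the two, since it does not depend on how value_counts orders categorical data.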

Thanks to nivniv and InLaw for this improvement.


See also Discretization and quantiling.