Python pandas groupby aggregate on multiple columns, then pivot Python pandas groupby aggregate on multiple columns, then pivot python python

Python pandas groupby aggregate on multiple columns, then pivot


Edited for Pandas 0.22+ considering the deprecation of the use of dictionaries in a group by aggregation.

We set up a very similar dictionary where we use the keys of the dictionary to specify our functions and the dictionary itself to rename the columns.

rnm_cols = dict(size='Size', sum='Sum', mean='Mean', std='Std')df.set_index(['Category', 'Item']).stack().groupby('Category') \  .agg(rnm_cols.keys()).rename(columns=rnm_cols)            Size   Sum        Mean        StdCategory                                     Books          3    58   19.333333   2.081666Clothes        3   148   49.333333   4.041452Technology     6  1800  300.000000  70.710678

option 1
use agg ← link to docs

agg_funcs = dict(Size='size', Sum='sum', Mean='mean', Std='std')df.set_index(['Category', 'Item']).stack().groupby(level=0).agg(agg_funcs)                  Std   Sum        Mean  SizeCategory                                     Books        2.081666    58   19.333333     3Clothes      4.041452   148   49.333333     3Technology  70.710678  1800  300.000000     6

option 2
more for less
use describe ← link to docs

df.set_index(['Category', 'Item']).stack().groupby(level=0).describe().unstack()            count        mean        std    min    25%    50%    75%    maxCategory                                                                   Books         3.0   19.333333   2.081666   17.0   18.5   20.0   20.5   21.0Clothes       3.0   49.333333   4.041452   45.0   47.5   50.0   51.5   53.0Technology    6.0  300.000000  70.710678  200.0  262.5  300.0  337.5  400.0


df.groupby('Category').agg({'Item':'size','shop1':['sum','mean','std'],'shop2':['sum','mean','std'],'shop3':['sum','mean','std']})

Or if you want it across all shops then:

df1 = df.set_index(['Item','Category']).stack().reset_index().rename(columns={'level_2':'Shops',0:'costs'})df1.groupby('Category').agg({'Item':'size','costs':['sum','mean','std']})


If I understand correctly, you want to calculate aggregate metrics for all shops, not for each individually. To do that, you can first stack your dataframe and then group by Category:

stacked = df.set_index(['Item', 'Category']).stack().reset_index()stacked.columns = ['Item', 'Category', 'Shop', 'Price']stacked.groupby('Category').agg({'Price':['count','sum','mean','std']})

Which results in

           Price                                        count   sum        mean        stdCategory                                     Books          3    58   19.333333   2.081666Clothes        3   148   49.333333   4.041452Technology     6  1800  300.000000  70.710678