Pandas Pivot tables row subtotals


If you put State in the rows and City in the columns, rather than both in the rows, you'll get separate margins for each. Stack the City level back into the index and you get the table you're after:

    In [10]: table = pivot_table(df, values=['SalesToday', 'SalesMTD','SalesYTD'],\
                         rows=['State'], cols=['City'], aggfunc=np.sum, margins=True)

    In [11]: table.stack('City')
    Out[11]:
                SalesMTD  SalesToday  SalesYTD
    State City
    stA   All        900          50      2100
          ctA        400          20      1000
          ctB        500          30      1100
    stB   All        700          50      2200
          ctC        500          10       900
          ctD        200          40      1300
    stC   All        300          30       800
          ctF        300          30       800
    All   All       1900         130      5100
          ctA        400          20      1000
          ctB        500          30      1100
          ctC        500          10       900
          ctD        200          40      1300
          ctF        300          30       800

I admit this isn't totally obvious.
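For readers on a recent pandas, here is a minimal sketch of the same idea. It assumes the `rows`/`cols` keywords have been replaced by `index`/`columns` (they were renamed in later pandas releases), builds the sample frame inline rather than from CSV, and adds a `dropna(how="all")` as a precaution, since newer `stack` implementations may keep State/City pairs that have no data. The name `subtotals` is only illustrative.

```python
import pandas as pd

# Same sample rows as used elsewhere in this thread, built inline.
df = pd.DataFrame({
    "State": ["stA", "stA", "stB", "stB", "stC"],
    "City": ["ctA", "ctB", "ctC", "ctD", "ctF"],
    "SalesToday": [20, 30, 10, 40, 30],
    "SalesMTD": [400, 500, 500, 200, 300],
    "SalesYTD": [1000, 1100, 900, 1300, 800],
})

# State in the rows, City in the columns, with "All" margins on both axes.
table = pd.pivot_table(
    df,
    values=["SalesToday", "SalesMTD", "SalesYTD"],
    index=["State"],
    columns=["City"],
    aggfunc="sum",
    margins=True,
)

# Move the City level from the columns into the row index, then drop
# State/City combinations that do not occur in the data (all-NaN rows).
subtotals = table.stack("City").dropna(how="all")
print(subtotals)
```

Each state then carries its own `All` row, and the trailing `All` block holds the per-city and grand totals.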


You can get the summarized values by using groupby() on the State column.

Let's make some sample data first:

    import pandas as pd
    from io import StringIO  # Python 3; the Python 2 original used the StringIO module

    incsv = StringIO("""Date,State,City,SalesToday,SalesMTD,SalesYTD
    20130320,stA,ctA,20,400,1000
    20130320,stA,ctB,30,500,1100
    20130320,stB,ctC,10,500,900
    20130320,stB,ctD,40,200,1300
    20130320,stC,ctF,30,300,800
    """)
    df = pd.read_csv(incsv, index_col=['Date'], parse_dates=True)

Then apply the groupby function and add a column City:

    # numeric_only=True keeps the string column City out of the sum
    dfsum = df.groupby('State', as_index=False).sum(numeric_only=True)
    dfsum['City'] = 'All'
    print(dfsum)

      State  SalesToday  SalesMTD  SalesYTD City
    0   stA          50       900      2100  All
    1   stB          50       700      2200  All
    2   stC          30       300       800  All

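We can append the original data to the summed df by using append. Note that the result has to be assigned back, otherwise print shows the unchanged dfsum. (DataFrame.append was removed in pandas 2.0; on newer versions use pd.concat([dfsum, df]) in its place.)

    dfsum = dfsum.append(df).set_index(['State','City']).sort_index()
    print(dfsum)

                SalesMTD  SalesToday  SalesYTD
    State City
    stA   All        900          50      2100
          ctA        400          20      1000
          ctB        500          30      1100
    stB   All        700          50      2200
          ctC        500          10       900
          ctD        200          40      1300
    stC   All        300          30       800
          ctF        300          30       800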

I added the set_index and sort_index to make it look more like your example output; they're not strictly necessary to get the results.
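As a rough sketch of the same groupby approach on current pandas, the version below swaps in `pd.concat`, because `DataFrame.append` no longer exists in pandas 2.0 and later. The sample frame is built inline, and the names `state_totals` and `with_subtotals` are made up for illustration.

```python
import pandas as pd

# Same sample rows as above, built inline instead of read from CSV.
df = pd.DataFrame({
    "State": ["stA", "stA", "stB", "stB", "stC"],
    "City": ["ctA", "ctB", "ctC", "ctD", "ctF"],
    "SalesToday": [20, 30, 10, 40, 30],
    "SalesMTD": [400, 500, 500, 200, 300],
    "SalesYTD": [1000, 1100, 900, 1300, 800],
})

# Per-state subtotals, labelled with the placeholder city "All".
state_totals = df.groupby("State", as_index=False).sum(numeric_only=True)
state_totals["City"] = "All"

# Stack the subtotal rows on top of the detail rows, then index and sort
# so that each state's "All" row sits directly above its cities.
with_subtotals = (
    pd.concat([state_totals, df], ignore_index=True)
    .set_index(["State", "City"])
    .sort_index()
)
print(with_subtotals)
```

The `All` rows sort first within each state only because uppercase letters order before lowercase ones; a subtotal label starting with a lowercase letter could land elsewhere.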


I think this subtotal example code is what you want (similar to Excel's subtotal).

I assume that you want to group by columns A, B, C and D, then count the values in column E:

    main_df.groupby(['A', 'B', 'C']).apply(lambda sub_df: sub_df
        .pivot_table(index=['D'], values=['E'], aggfunc='count', margins=True))

output:

    A B C  D  E
           a  1
    a a a  b  2
           c  2
         all  5
           a  3
    b b a  b  2
           c  2
         all  7
           a  3
    b b b  b  6
           c  2
           d  3
         all 14
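To try this out end to end, here is a small self-contained sketch. The contents of `main_df` are invented for illustration (they are not the data behind the output above), `count_with_subtotal` is a hypothetical helper name, and `margins_name="all"` is an assumption made only to get the lowercase label shown in that output. Selecting `[["D", "E"]]` before `apply` is my own choice, so the helper only ever sees the non-grouping columns.

```python
import pandas as pd

# Hypothetical data: A, B, C, D are the grouping keys; E is the column counted.
main_df = pd.DataFrame({
    "A": ["a", "a", "a", "b", "b", "b"],
    "B": ["a", "a", "a", "b", "b", "b"],
    "C": ["a", "a", "a", "a", "a", "b"],
    "D": ["a", "b", "b", "a", "c", "b"],
    "E": [1, 2, 3, 4, 5, 6],
})

def count_with_subtotal(sub_df):
    """Count E per value of D and append a subtotal row labelled "all"."""
    return sub_df.pivot_table(
        index=["D"],
        values=["E"],
        aggfunc="count",
        margins=True,
        margins_name="all",
    )

# One small pivot table per (A, B, C) group, stacked into a single frame
# whose row index is (A, B, C, D).
result = main_df.groupby(["A", "B", "C"])[["D", "E"]].apply(count_with_subtotal)
print(result)
```

Each (A, B, C) group ends with its own `all` row, which is the per-group subtotal.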