Pivot Tables or Group By for Pandas?


Here are a few ways to reshape your data, df:

In [27]: df
Out[27]:
     Col X  Col Y
0  class 1  cat 1
1  class 2  cat 1
2  class 3  cat 2
3  class 2  cat 3
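To follow along, df can be rebuilt from the output above. This is a minimal sketch that assumes pandas is imported as pd; the values are copied from the printed frame.

```python
import pandas as pd

# Rebuild the example frame shown above: one row per (Col X, Col Y) observation.
df = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3"],
})
print(df)
```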

1) Using pd.crosstab()

In [28]: pd.crosstab(df['Col X'], df['Col Y'])
Out[28]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0

2) Or, use groupby on 'Col X' and 'Col Y', count rows with size(), then unstack over Col Y, filling missing combinations with zeros.

In [29]: df.groupby(['Col X','Col Y']).size().unstack('Col Y', fill_value=0)
Out[29]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0
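Approaches 1 and 2 should give the same counts. A quick way to check that on your own data might look like the sketch below; the variable names are illustrative, and the comparison looks at values and labels only, since dtypes may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3"],
})

by_crosstab = pd.crosstab(df["Col X"], df["Col Y"])
by_groupby = df.groupby(["Col X", "Col Y"]).size().unstack("Col Y", fill_value=0)

# Compare cell values and row/column labels, ignoring dtype differences.
same_values = bool((by_crosstab.to_numpy() == by_groupby.to_numpy()).all())
same_labels = (
    by_crosstab.index.equals(by_groupby.index)
    and by_crosstab.columns.equals(by_groupby.columns)
)
print(same_values and same_labels)
```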

3) Or, use pd.pivot_table() with index=Col X, columns=Col Y

In [30]: pd.pivot_table(df, index=['Col X'], columns=['Col Y'], aggfunc=len, fill_value=0)
Out[30]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0
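In the call above, both columns of df are used as keys, so pivot_table has no separate column to aggregate. A variant that makes the counted column explicit is sketched here; the helper column n is hypothetical (not part of the original data) and simply holds a 1 per row so that summing it counts rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3"],
})

# "n" is a hypothetical helper column: a constant 1 per row, so its sum is a row count.
counts = pd.pivot_table(
    df.assign(n=1),
    index="Col X",
    columns="Col Y",
    values="n",
    aggfunc="sum",
    fill_value=0,
)
print(counts)
```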

4) Or, use set_index with unstack

In [492]: df.assign(v=1).set_index(['Col X', 'Col Y'])['v'].unstack(fill_value=0)
Out[492]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0
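Note that approach 4 pivots a constant rather than counting, so it relies on each (Col X, Col Y) pair appearing at most once. The sketch below illustrates the difference on a frame with one repeated pair; the extra row is made up for illustration and is not in the original data.

```python
import pandas as pd

# Same data plus one repeated (class 2, cat 3) row, added only for illustration.
dup = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3", "cat 3"],
})

# A counting approach reports the repeated pair as 2.
counts = dup.groupby(["Col X", "Col Y"]).size().unstack("Col Y", fill_value=0)
print(counts)

# set_index + unstack has no aggregation step, so a repeated pair cannot be reshaped.
try:
    dup.assign(v=1).set_index(["Col X", "Col Y"])["v"].unstack(fill_value=0)
    reshaped = True
except ValueError as exc:
    reshaped = False
    print("unstack failed:", exc)
```

If your data can contain repeated pairs, crosstab, groupby with size, or pivot_table are the safer choices.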