How to move pandas data from index to column after multiple groupby
Method #1: reset_index()
>>> g uses books sum sumtoken year xanthos 1830 3 3 1840 3 3 1868 2 2 1875 1 1[4 rows x 2 columns]>>> g = g.reset_index()>>> g token year uses books sum sum0 xanthos 1830 3 31 xanthos 1840 3 32 xanthos 1868 2 23 xanthos 1875 1 1[4 rows x 4 columns]
Method #2: don't make the index in the first place, using as_index=False
>>> g = dfalph[['token', 'year', 'uses', 'books']].groupby(['token', 'year'], as_index=False).sum()>>> g token year uses books0 xanthos 1830 3 31 xanthos 1840 3 32 xanthos 1868 2 23 xanthos 1875 1 1[4 rows x 4 columns]
I defer form the accepted answer.While there are 2 ways to do this, these will not necessarily result in same output. Specially when you are using Grouper
in groupby
index=False
reset_index()
example df
+---------+---------+-------------+------------+| column1 | column2 | column_date | column_sum |+---------+---------+-------------+------------+| A | M | 26-10-2018 | 2 || B | M | 28-10-2018 | 3 || A | M | 30-10-2018 | 6 || B | M | 01-11-2018 | 3 || C | N | 03-11-2018 | 4 |+---------+---------+-------------+------------+
They do not work the same way.
df = df.groupby( by=[ 'column1', 'column2', pd.Grouper(key='column_date', freq='M') ], as_index=False).sum()
The above will give
+---------+---------+------------+| column1 | column2 | column_sum |+---------+---------+------------+| A | M | 8 || B | M | 3 || B | M | 3 || C | N | 4 |+---------+---------+------------+
While,
df = df.groupby( by=[ 'column1', 'column2', pd.Grouper(key='column_date', freq='M') ]).sum().reset_index()
Will give
+---------+---------+-------------+------------+| column1 | column2 | column_date | column_sum |+---------+---------+-------------+------------+| A | M | 31-10-2018 | 8 || B | M | 31-10-2018 | 3 || B | M | 30-11-2018 | 3 || C | N | 30-11-2018 | 4 |+---------+---------+-------------+------------+
You need to add drop=True
:
df.reset_index(drop=True)df = df.groupby( by=[ 'column1', 'column2', pd.Grouper(key='column_date', freq='M') ]).sum().reset_index(drop=True)