How to access pandas groupby dataframe by key
You can use the get_group method:
In [21]: gb.get_group('foo')
Out[21]:
     A         B   C
0  foo  1.624345   5
2  foo -0.528172  11
4  foo  0.865408  14
Note: This doesn't require creating an intermediate dictionary or a copy of every sub-DataFrame for every group, so it will be much more memory-efficient than building the naive dictionary with dict(iter(gb)). This is because it uses data structures already available in the groupby object.
You can select different columns using the groupby slicing:
In [22]: gb[["A", "B"]].get_group("foo")
Out[22]:
     A         B
0  foo  1.624345
2  foo -0.528172
4  foo  0.865408

In [23]: gb["C"].get_group("foo")
Out[23]:
0     5
2    11
4    14
Name: C, dtype: int64
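If you want to run the session above end to end, here is a minimal self-contained sketch. The construction of df is an assumption on my part: the foo rows reuse the values shown in the output, the bar rows are placeholders invented for illustration, and the frame is grouped on the scalar key "A".

```python
import pandas as pd

# Hypothetical data: the 'foo' rows match the output shown above,
# the 'bar' rows are placeholders invented for illustration.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "B": [1.624345, -0.611756, -0.528172, -1.072969, 0.865408, -2.301539],
    "C": [5, 0, 11, 1, 14, 2],
})
gb = df.groupby("A")

foo = gb.get_group("foo")                  # all columns for group 'foo'
foo_ab = gb[["A", "B"]].get_group("foo")   # only columns A and B
foo_c = gb["C"].get_group("foo")           # a Series holding column C

print(foo)
print(foo_ab)
print(foo_c)
```

Asking for a label that is not one of the groups should raise a KeyError, so check membership in gb.groups first if the key may be missing.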
Wes McKinney (pandas' author) in Python for Data Analysis provides the following recipe:
groups = dict(list(gb))
which returns a dictionary whose keys are your group labels and whose values are DataFrames, i.e. groups['foo'] will yield what you are looking for:
     A         B   C
0  foo  1.624345   5
2  foo -0.528172  11
4  foo  0.865408  14
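As a rough sketch of why this recipe works: iterating over a groupby yields (label, sub-DataFrame) pairs, and dict() turns those pairs into a mapping. The small df below is hypothetical and only there to make the snippet runnable. It groups on the scalar key "A"; as far as I know, recent pandas versions yield tuple labels such as ('foo',) when you group by a one-element list instead.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": [1.0, 2.0, 3.0, 4.0],
})
gb = df.groupby("A")

# Iterating a groupby yields (label, sub-DataFrame) pairs,
# so dict() builds a label -> DataFrame mapping.
groups = dict(list(gb))

print(sorted(groups))   # the group labels
print(groups["foo"])    # same rows as gb.get_group("foo")
```

Keep in mind that this materializes every group up front, which is convenient for repeated lookups but costs more memory than calling get_group on demand.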
Rather than gb.get_group('foo'), I prefer using gb.groups:
df.loc[gb.groups['foo']]
because this way you can select multiple columns as well. For example:
df.loc[gb.groups['foo'], ['A', 'B']]
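Here is a small runnable sketch of the gb.groups approach. The df is hypothetical; the point is that gb.groups maps each group label to the index labels of its rows, which can then be passed to df.loc.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": [1.0, 2.0, 3.0, 4.0],
    "C": [5, 6, 7, 8],
})
gb = df.groupby("A")

foo_labels = gb.groups["foo"]             # Index of row labels in group 'foo'
foo_rows = df.loc[foo_labels]             # all columns
foo_ab = df.loc[foo_labels, ["A", "B"]]   # only the selected columns

print(foo_rows)
print(foo_ab)
```

Since this goes through index labels, it assumes the index of df is unique; with duplicate labels, df.loc can return more rows than the group actually contains.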