Iterating over groups (Python pandas dataframe)


The object returned by .groupby() has a .groups attribute, a Python dict that maps each group key to the index labels of the rows in that group. In this case:

In [26]: df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
   ....:                    'B': ['me', 'you', 'me'] * 2,
   ....:                    'C': [5, 2, 3, 4, 6, 9]})

In [27]: groups = df.groupby('A')

In [28]: groups.groups
Out[28]: {'bar': [1L, 3L, 5L], 'foo': [0L, 2L, 4L]}

You can iterate over this as follows:

keys = groups.groups.keys()
# Walk over consecutive pairs of groups
for index in range(0, len(keys) - 1):
    g1 = df.ix[groups.groups[keys[index]]]
    g2 = df.ix[groups.groups[keys[index + 1]]]
    # Do something with g1, g2
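If you only need one group at a time rather than consecutive pairs, the GroupBy object can also be iterated directly. A minimal sketch, reusing the df from above (the `seen` dict is only there for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': ['me', 'you', 'me'] * 2,
                   'C': [5, 2, 3, 4, 6, 9]})

# Iterating a GroupBy object yields (group key, sub-DataFrame) pairs.
# With the default sort=True the keys come out in sorted order.
seen = {}
for name, group in df.groupby('A'):
    # group is an ordinary DataFrame holding only the rows for this key
    seen[name] = group['C'].tolist()

print(seen)
```

This avoids the manual lookup through .groups entirely.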

However, please remember that using for loops to iterate over pandas objects is generally slower than vectorized operations. Depending on what you need done, and if it needs to be fast, you may want to try other approaches.
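As one example of such an approach, a per-group computation can often be written as a single aggregation instead of a loop. A rough sketch, assuming all you want is the sum of column C per group (`totals` is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': ['me', 'you', 'me'] * 2,
                   'C': [5, 2, 3, 4, 6, 9]})

# One aggregation call replaces an explicit Python loop over the groups.
# The result is a Series indexed by the group keys.
totals = df.groupby('A')['C'].sum()

print(totals)
```

Whether this fits depends on what "do something with g1, g2" actually is; comparisons between neighbouring groups may still need a loop.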


In Python 3, dict_keys objects are not subscriptable, so change:

df.ix[groups.groups[keys[index]]]

to

df.ix[groups.groups[list(keys)[index]]]
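Note also that .ix was deprecated and then removed in pandas 1.0, so on current versions the loop needs .loc instead. A sketch of the same pairwise iteration under those assumptions (sorting the keys and collecting into `pairs` are my additions, only to make the result easy to inspect):

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': ['me', 'you', 'me'] * 2,
                   'C': [5, 2, 3, 4, 6, 9]})

groups = df.groupby('A')

# Turn the dict keys into a sorted list so they can be indexed and paired.
keys = sorted(groups.groups)

pairs = []
# zip(keys, keys[1:]) walks over consecutive pairs of group keys.
for k1, k2 in zip(keys, keys[1:]):
    # .groups holds index labels, so label-based .loc is the right lookup.
    g1 = df.loc[groups.groups[k1]]
    g2 = df.loc[groups.groups[k2]]
    # Do something with g1, g2; here we just record their sizes.
    pairs.append((k1, k2, len(g1), len(g2)))

print(pairs)
```

With only two groups in this example there is a single pair, ('bar', 'foo').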