pandas rolling max with groupby


It looks like you need cummax() instead of .rolling(N).max():

In [29]: df['new'] = df.groupby('id').value.cummax()

In [30]: df
Out[30]:
   id  value  new
0   1      3    3
1   1      6    6
2   1      3    6
3   2      2    2
4   2      1    2
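For readers who want to run this outside IPython, here is a minimal self-contained sketch. The DataFrame is reconstructed from the output shown above; the second Series (`s`) and the window size of 2 are hypothetical and only there to illustrate how a fixed-window rolling max differs from a cumulative max.

```python
import pandas as pd

# Sample data reconstructed from the output shown above.
df = pd.DataFrame({"id": [1, 1, 1, 2, 2], "value": [3, 6, 3, 2, 1]})

# Cumulative (running) maximum within each id group.
df["new"] = df.groupby("id")["value"].cummax()
print(df)

# Hypothetical series to contrast the two operations:
# a rolling max only looks back over the last N rows, so it can drop again,
# while a cumulative max never decreases.
s = pd.Series([5, 1, 2, 1])
rolling_2 = s.rolling(2, min_periods=1).max()  # 5.0, 5.0, 2.0, 2.0
running = s.cummax()                           # 5, 5, 5, 5
```

If the goal is "largest value seen so far in the group", cummax() expresses that directly; .rolling(N).max() is the right tool only when the maximum should be limited to the last N rows.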

Timing (using brand new Pandas version 0.20.1):

In [3]: df = pd.concat([df] * 10**4, ignore_index=True)

In [4]: df.shape
Out[4]: (50000, 2)

In [5]: %timeit df.groupby('id').value.apply(lambda x: x.cummax())
100 loops, best of 3: 15.8 ms per loop

In [6]: %timeit df.groupby('id').value.cummax()
100 loops, best of 3: 4.09 ms per loop

NOTE: the Pandas 0.20.0 "what's new" notes list improved performance of groupby().cummin() and groupby().cummax().


In my timing, using apply was a tiny bit faster:

# Using apply
df['output'] = df.groupby('id').value.apply(lambda x: x.cummax())

%timeit df['output'] = df.groupby('id').value.apply(lambda x: x.cummax())
1000 loops, best of 3: 1.57 ms per loop

Other method:

df['output'] = df.groupby('id').value.cummax()

%timeit df['output'] = df.groupby('id').value.cummax()
1000 loops, best of 3: 1.66 ms per loop
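The timings quoted on this page depend on the pandas version and the size of the frame, so it may be worth rerunning the comparison yourself. Below is a rough sketch using the standard timeit module instead of IPython's %timeit. The helper names (`with_apply`, `with_cummax`), the repeat count, and the number of timing runs are arbitrary choices for illustration, and `group_keys=False` is an assumption added so the apply result keeps the original row index on newer pandas versions.

```python
import timeit

import pandas as pd

# Sample data from the first answer, repeated to get a larger frame.
# The repeat count (10**3) is an arbitrary choice for this sketch.
base = pd.DataFrame({"id": [1, 1, 1, 2, 2], "value": [3, 6, 3, 2, 1]})
big = pd.concat([base] * 10**3, ignore_index=True)


def with_apply(frame):
    # group_keys=False avoids prepending the group key to the result index
    # on newer pandas versions (assumption: not needed on old versions).
    return frame.groupby("id", group_keys=False)["value"].apply(lambda s: s.cummax())


def with_cummax(frame):
    # Built-in grouped cumulative max.
    return frame.groupby("id")["value"].cummax()


# Both approaches should produce the same values once aligned on the index.
same = with_apply(big).sort_index().equals(with_cummax(big).sort_index())

# Time each approach; absolute numbers vary by machine and pandas version.
t_apply = timeit.timeit(lambda: with_apply(big), number=20)
t_cummax = timeit.timeit(lambda: with_cummax(big), number=20)
print(f"equal: {same}, apply: {t_apply:.4f}s, cummax: {t_cummax:.4f}s")
```

On a tiny frame the difference between the two is likely to be within measurement noise; any gap should become easier to see as the number of rows grows.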