
How to use pandas to find consecutive same data in time series


This should do it

df['value_grp'] = (df.Value.diff(1) != 0).astype('int').cumsum()

value_grp increments by one each time Value changes, so every run of identical consecutive values shares one group label. You can then aggregate per group to extract the results:

pd.DataFrame({'BeginDate': df.groupby('value_grp').Date.first(),
              'EndDate': df.groupby('value_grp').Date.last(),
              'Consecutive': df.groupby('value_grp').size(),
              'No': df.groupby('value_grp').No.first()}).reset_index(drop=True)
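To see how the labelling behaves, here is a minimal self-contained sketch of the same idea. The sample data is made up for illustration; only the column names (Date, No, Value) come from the answer. It also collapses the four separate groupby calls into a single one with named aggregation, which assumes pandas 0.25 or later.

```python
import pandas as pd

# Made-up sample data; column names follow the answer (Date, No, Value).
df = pd.DataFrame({
    'Date': pd.to_datetime(['1999-11-10', '1999-11-11', '1999-11-12',
                            '1999-11-15', '1999-11-16', '1999-11-17']),
    'No': [600000] * 6,
    'Value': [5.0, 5.0, 5.0, 7.0, 7.0, 5.0],
})

# diff() is NaN on the first row and non-zero wherever the value changes,
# so the cumulative sum of "changed" flags gives each run its own label.
df['value_grp'] = (df['Value'].diff() != 0).astype('int').cumsum()
# value_grp -> [1, 1, 1, 2, 2, 3]

# One groupby, one output row per run (named aggregation, pandas >= 0.25).
runs = (df.groupby('value_grp')
          .agg(BeginDate=('Date', 'first'),
               EndDate=('Date', 'last'),
               Consecutive=('Value', 'size'),
               No=('No', 'first'))
          .reset_index(drop=True))
print(runs)
```

Note that diff() only works on numeric columns; for strings or other dtypes, comparing against shift() (df.Value != df.Value.shift()) produces the same change flags.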


Here is an alternative solution:

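# Note: the nested-dict renaming passed to .agg() was deprecated in pandas 0.20
# and removed in pandas 1.0, so this form needs an older pandas.
# Caution: grouping by 'Consecutive' merges separate runs of the same length.
rslt = (df.assign(Consecutive=df.Value
                                .groupby((df.Value != df.Value.shift())
                                         .cumsum())
                                .transform('size'))
          .query('Consecutive > 1')
          .groupby('Consecutive')
          .agg({'No':{'No':'first'}, 'Date': {'BeginDate':'first', 'EndDate':'last'}})
          .reset_index())
# Flatten the two-level column index produced by the nested dict
rslt.columns = [t[1] if t[1] else t[0] for t in rslt.columns]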

Demo:

In [225]: %paste
rslt = (df.assign(Consecutive=df.Value
                                .groupby((df.Value != df.Value.shift())
                                         .cumsum())
                                .transform('size'))
          .query('Consecutive > 1')
          .groupby('Consecutive')
          .agg({'No':{'No':'first'}, 'Date': {'BeginDate':'first', 'EndDate':'last'}})
          .reset_index())
rslt.columns = [t[1] if t[1] else t[0] for t in rslt.columns]
## -- End pasted text --

In [226]: rslt
Out[226]:
   Consecutive  BeginDate    EndDate      No
0            2 1999-11-26 1999-11-29  600001
1            4 1999-11-19 1999-11-24  600000
2            6 1999-11-10 1999-11-17  600000
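
On pandas 1.0 and later, passing a nested dict to agg raises an error, so the snippet above will not run there as written. A possible adaptation, sketched below, uses named aggregation (pandas 0.25 or later) instead. It also groups by a run label rather than by run length, so two separate runs that happen to have the same length stay on separate rows; that is a deliberate change from the code above, not the original author's method. The sample data and the run_id name are made up for illustration.

```python
import pandas as pd

# Made-up sample data: two separate runs of length 2, plus one single row.
df = pd.DataFrame({
    'Date': pd.to_datetime(['1999-11-10', '1999-11-11', '1999-11-12',
                            '1999-11-15', '1999-11-16']),
    'No': [600000, 600000, 600000, 600001, 600001],
    'Value': [5.0, 5.0, 7.0, 9.0, 9.0],
})

# Label each run: a new label starts wherever Value differs from the row before.
# run_id is a hypothetical helper name, not part of the original answer.
run_id = (df['Value'] != df['Value'].shift()).cumsum().rename('run_id')

rslt = (df.groupby(run_id)
          .agg(Consecutive=('Value', 'size'),
               BeginDate=('Date', 'first'),
               EndDate=('Date', 'last'),
               No=('No', 'first'))
          .query('Consecutive > 1')   # keep only runs of two or more rows
          .reset_index(drop=True))
print(rslt)
```

With this data the result has two rows, one per length-2 run, whereas grouping by 'Consecutive' would have merged them into a single row.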