How to count consecutive ordered values on pandas data frame


Here is one way: create an additional key for the groupby, then group by this key together with id.

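    # Helper key: within each id, the counter increases at every non-zero row,
    # so the zeros between two non-zero rows share one key.
    # transform keeps the original row index, so s lines up with df.
    s = df.groupby('id').value.transform(lambda x: x.ne(0).cumsum())

    # Count the zeros per (id, key), then keep the largest count per id.
    # groupby(level=0).max() replaces max(level=0), which was removed in pandas 2.0.
    df[df.value == 0].groupby([df.id, s]).size().groupby(level=0).max().reindex(df.id.unique(), fill_value=0)

    Out[267]:
    id
    354    3
    357    2
    540    0
    dtype: int64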
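The question's data frame is not shown here, so the following is only a minimal, self-contained sketch of the helper-key idea. The sample `df` is hypothetical, chosen so that the result matches the output above, and the names `key`, `zeros`, `run_lengths`, and `longest` are illustrative.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not taken from the question).
df = pd.DataFrame({
    "id":    [354, 354, 354, 354, 354, 357, 357, 357, 540, 540],
    "value": [0,   0,   0,   1,   0,   0,   0,   1,   1,   2],
})

# Helper key: within each id, the counter increases at every non-zero row,
# so zeros that sit between the same two non-zero rows share one key.
key = df.groupby("id")["value"].transform(lambda x: x.ne(0).cumsum())

# Keep only the zero rows, carrying the key along as a column.
zeros = df.assign(key=key)[df["value"].eq(0)]

# Length of every run of zeros, per id.
run_lengths = zeros.groupby(["id", "key"]).size()

# Longest run per id; ids without any zero get 0.
longest = (
    run_lengths.groupby(level="id").max()
    .reindex(df["id"].unique(), fill_value=0)
)
print(longest)
```

Storing the key as a column before filtering avoids relying on index alignment between the filtered frame and the full-length key.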


Create a group ID m for consecutive rows with the same value. Next, group by id and m and call value_counts, then use .loc on the MultiIndex to keep only the value 0 at the right-most index level. Finally, reduce the result to one row per id by filtering duplicated index entries with duplicated, and reindex to add a count of 0 for any id that has no zeros.

    m = df.value.diff().ne(0).cumsum().rename('gid')
    # Consecutive rows having the same value are assigned the same ID number by this command.
    # It identifies a group of consecutive rows having the same value, so I called it groupID.

    df1 = df.groupby(['id', m]).value.value_counts().loc[:,:,0].droplevel(-1)
    # This groupby puts consecutive rows of the same value, per id, into separate groups.
    # Within each group, count each value and use `.loc` to pick only `0`,
    # because we only care about the count of value `0`.

    df1 = df1.sort_values(ascending=False)
    df1[~df1.index.duplicated()].reindex(df.id.unique(), fill_value=0)
    # There may be several groups of value `0` per `id`. We want only the group with the highest count.
    # `value_counts` sorts counts only within each (id, gid) group, so sort all counts descending first,
    # then pick the top one of the duplicates by slicing on the True/False mask of `duplicated`.
    # Finally, `reindex` adds any `id` that has no value 0 in the original `df`.
    # Note: `id` is the column `id` in `df`. It is different from the groupID `m` we create to use with groupby.

    Out[315]:
    id
    354    3
    357    2
    540    0
    Name: value, dtype: int64
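To see what the `diff().ne(0).cumsum()` step produces, here is a small sketch on a hypothetical Series (the data and the names `values`, `run_id`, `run_lengths`, and `zero_runs` are illustrative, not from the question). It also shows that the first run of zeros is not necessarily the longest one.

```python
import pandas as pd

# Hypothetical values: two runs of zeros, of length 2 and 3.
values = pd.Series([0, 0, 1, 0, 0, 0, 2, 2])

# diff() is non-zero (or NaN on the first row) wherever the value changes,
# so the cumulative sum gives every run of equal values its own number.
run_id = values.diff().ne(0).cumsum()
print(run_id.tolist())

# Length of every run, indexed by run number.
run_lengths = values.groupby(run_id).size()

# Keep only the runs whose value is 0.
is_zero_run = values.groupby(run_id).first().eq(0)
zero_runs = run_lengths[is_zero_run]
print(zero_runs)
```

Because the run numbers are computed over the whole column, a run that continues across two ids only gets split once `id` is added to the groupby keys.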


You could do:

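    import numpy as np

    # Per id: number the runs of equal values, blank out the non-zero rows,
    # count the rows in each remaining run and take the largest count.
    # An id without zeros yields NaN, which fillna turns into 0.
    df.groupby('id').value.apply(
        lambda x: (x.diff() != 0).cumsum().where(x == 0, np.nan).value_counts().max()
    ).fillna(0)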

Output

    id
    354    3.0
    357    2.0
    540    0.0
    Name: value, dtype: float64
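The result above is a float Series because ids without zeros pass through NaN. If integer counts are preferred, one possible variation is to wrap the per-group logic in a small function. This is only a sketch: `longest_zero_run` is a hypothetical helper, and the sample `df` is made up to match the outputs shown in this thread.

```python
import pandas as pd


def longest_zero_run(values: pd.Series) -> int:
    """Return the length of the longest run of consecutive zeros."""
    is_zero = values.eq(0)
    # A new run starts wherever the zero/non-zero flag differs from the previous row.
    run_id = is_zero.ne(is_zero.shift()).cumsum()
    # Summing the boolean flag gives the run length for zero runs and 0 otherwise.
    zeros_per_run = is_zero.groupby(run_id).sum()
    return int(zeros_per_run.max()) if len(zeros_per_run) else 0


# Hypothetical sample data (an assumption, not taken from the question).
df = pd.DataFrame({
    "id":    [354, 354, 354, 354, 354, 357, 357, 357, 540, 540],
    "value": [0,   0,   0,   1,   0,   0,   0,   1,   1,   2],
})

result = df.groupby("id")["value"].agg(longest_zero_run)
print(result)
```

The function never produces NaN, so no `fillna` step is needed afterwards.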