
How to downsample time series data in pandas?


You can convert your time column to actual timedeltas, then use resample for a vectorized solution:

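import pandas as pd
# Treat the minute counts in df.time as timedeltas so resample can bin them
t = pd.to_timedelta(df.time, unit='min')
# Keep the last row of each 3-minute bin within each id
s = df.set_index(t).groupby('id').resample('3min').last().reset_index(drop=True)
s.assign(time=s.groupby('id').cumcount())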

   id  time  value
0   1     0      5
1   1     1     16
2   1     2     20
3   2     0      8
4   2     1     10
5   4     0      6
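For anyone who wants to try this end to end, here is a minimal self-contained sketch. The input frame is hypothetical: the question's data is not shown here, so the values are made up to be consistent with the output above, and I am assuming one row per minute per id with time starting at 0. It also selects the value column before resampling, which is a small variation on the snippet above so that the id and time columns come back from the index via reset_index.

```python
import pandas as pd

# Hypothetical input: one row per minute per id (values are made up).
df = pd.DataFrame({
    "id":    [1] * 8 + [2] * 4 + [4] * 2,
    "time":  list(range(8)) + list(range(4)) + list(range(2)),
    "value": [1, 3, 5, 9, 12, 16, 18, 20, 4, 6, 8, 10, 2, 6],
})

# Interpret the integer minute counter as timedeltas.
t = pd.to_timedelta(df["time"], unit="min")

# Keep the last value of each 3-minute bin within each id.
s = (
    df.set_index(t)
      .groupby("id")["value"]
      .resample("3min")
      .last()
      .reset_index()
)

# Replace the bin labels with a per-id counter 0, 1, 2, ...
result = s.assign(time=s.groupby("id").cumcount())[["id", "time", "value"]]
print(result)
```

If the real data has gaps longer than three minutes, resample will emit empty bins (NaN after last()), so a dropna() may be needed before renumbering.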


Use np.r_ and .iloc with groupby:

import numpy as np
df.groupby('id')['value'].apply(lambda x: x.iloc[np.r_[2:len(x):3,-1]])

Output:

id    
1   2      5
    5     16
    7     20
2   10     8
    11    10
4   13     6
Name: value, dtype: int64
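To see what np.r_ is doing here, it helps to print the positions it builds for a few group lengths. This is only an illustration; the helper name positions is mine, and the lengths 8, 4 and 2 are the group sizes implied by the output above.

```python
import numpy as np

def positions(n):
    # Every third position starting at 2, plus the last row (-1).
    return np.r_[2:n:3, -1].tolist()

for n in (8, 4, 2, 3):
    print(n, positions(n))
```

Note the length-3 case: positions 2 and -1 point at the same row, so that row would be selected twice. If group lengths can be a multiple of 3, you may want to drop duplicates afterwards.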

Going a little further, with column naming and a renumbered time column:

df_out = df.groupby('id')['value']\
           .apply(lambda x: x.iloc[np.r_[2:len(x):3,-1]]).reset_index()
df_out.assign(time=df_out.groupby('id').cumcount()).drop('level_1', axis=1)

Output:

   id  value  time
0   1      5     0
1   1     16     1
2   1     20     2
3   2      8     0
4   2     10     1
5   4      6     0
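As a variation (not the code above, just a sketch of the same selection rule), a boolean mask built from cumcount avoids apply and cannot pick the same row twice. The input frame is hypothetical, with values made up to match the output shown, and it assumes rows are already ordered by time within each id.

```python
import pandas as pd

# Hypothetical input: one row per minute per id (values are made up).
df = pd.DataFrame({
    "id":    [1] * 8 + [2] * 4 + [4] * 2,
    "time":  list(range(8)) + list(range(4)) + list(range(2)),
    "value": [1, 3, 5, 9, 12, 16, 18, 20, 4, 6, 8, 10, 2, 6],
})

g = df.groupby("id")
pos = g.cumcount()                    # 0-based position within each id
size = g["value"].transform("size")   # rows per id, aligned to df

# Keep every third row (positions 2, 5, 8, ...) and the last row of each id.
keep = (pos % 3 == 2) | (pos == size - 1)

df_out = df.loc[keep, ["id", "value"]].reset_index(drop=True)
df_out = df_out.assign(time=df_out.groupby("id").cumcount())
print(df_out)
```

Because the mask is a logical OR, a row that is both "every third" and "last" is kept once.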