Is it possible to use pandas.DataFrame.rolling with a step greater than 1?
You can still use rolling; you just need a little extra work on the index you assign to.
Here by = 2:
    by = 2
    # Assign the rolling mean only to rows at positions 1, 3, 5, ...
    df.loc[df.index[np.arange(len(df)) % by == 1], 'New'] = df.Price.rolling(window=4).mean()
    df

        Price    New
    0      63    NaN
    1      92    NaN
    2      92    NaN
    3       5  63.00
    4      90    NaN
    5       3  47.50
    6      81    NaN
    7      98  68.00
    8     100    NaN
    9      58  84.25
    10     38    NaN
    11     15  52.75
    12     75    NaN
    13     19  36.75
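To make the snippet above reproducible, here is a minimal self-contained sketch. The Price values are copied from the output shown; creating the New column up front and the `kept` comparison are my own additions, not part of the original answer. Note that the rolling mean is still computed for every row here; only the assignment is thinned out.

```python
import numpy as np
import pandas as pd

# Price values copied from the output above
df = pd.DataFrame(
    {"Price": [63, 92, 92, 5, 90, 3, 81, 98, 100, 58, 38, 15, 75, 19]}
)

by = 2
# Positional mask selecting rows 1, 3, 5, ...
mask = np.arange(len(df)) % by == 1

# Create the target column up front so the assignment below only fills rows
df["New"] = np.nan

# The right-hand side is aligned on the index, so only the masked rows
# receive a value; every other row of 'New' stays NaN.
df.loc[df.index[mask], "New"] = df.Price.rolling(window=4).mean()

# The same values seen another way: every `by`-th row of the full rolling mean
kept = df.Price.rolling(window=4).mean().iloc[1::by]

print(df)
```

Row 1 is selected by the mask but stays NaN, because a window of 4 has no complete data until row 3.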
I know it has been a long time since the question was asked, but I ran into the same problem. When dealing with long time series, you really want to avoid unnecessary calculation of values you are not interested in. Since pandas' rolling method did not implement a step argument at the time (one was added in pandas 1.5), I wrote a workaround using numpy.
It is basically a combination of the solution in this link and the indexing proposed by BENY.
    import numpy as np


    def apply_rolling_data(data, col, function, window, step=1, labels=None):
        """Perform a rolling window analysis on the column `col` of `data`.

        Given a dataframe `data` holding time series, call `function` on
        sections of length `window` of the column `col`. The results are
        appended to `data` as new columns named after `labels`.

        Parameters
        ----------
        data : DataFrame
            Data to be analyzed. The dataframe must store time series
            columnwise, i.e., each column represents a time series and
            each row a time index.
        col : str
            Name of the column of `data` to be analyzed.
        function : callable
            Function called on each window. It receives a 1-D array and
            must return either a number or a fixed-length array-like
            (e.g. a pandas Series).
        window : int
            Length of the window.
        step : int
            Step between two consecutive windows.
        labels : str or list of str, optional
            Names of the output columns, one per value returned by
            `function`. If None, defaults to 'metric_0', 'metric_1', ...

        Returns
        -------
        data : DataFrame
            Input dataframe with added columns holding the results.
        """
        x = _strided_app(data[col].to_numpy(), window, step)
        rolled = np.apply_along_axis(function, 1, x)
        if rolled.ndim == 1:
            # `function` returned a scalar per window: treat it as one column
            rolled = rolled[:, np.newaxis]

        if labels is None:
            labels = [f"metric_{i}" for i in range(rolled.shape[1])]
        elif isinstance(labels, str):
            labels = [labels]

        for label in labels:
            data[label] = np.nan

        # Results go to the last row of each window: the first window ends
        # at position window-1, then one result every `step` rows.
        data.loc[
            data.index[
                [False] * (window - 1)
                + list(np.arange(len(data) - (window - 1)) % step == 0)],
            labels] = rolled

        return data


    def _strided_app(a, L, S):  # Window len = L, Stride len/stepsize = S
        """Return a strided view of `a` with one window of length L per row."""
        nrows = ((a.size - L) // S) + 1
        n = a.strides[0]
        # as_strided does no bounds checking, so shape and strides must be exact
        return np.lib.stride_tricks.as_strided(
            a, shape=(nrows, L), strides=(S * n, n))
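For comparison, the same idea can be sketched with `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later), which builds the window view with bounds checking instead of hand-written strides. This is a swapped-in alternative, not the code from the answer above; the helper name `rolling_with_step` is hypothetical, and the sample values are the Price column from the first answer.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_with_step(series, function, window, step=1):
    """Apply `function` to every `step`-th window of length `window`.

    Hypothetical helper for illustration only. Returns a Series indexed
    by the label of the last row of each evaluated window.
    """
    values = series.to_numpy()
    # All windows as a read-only view of shape (len - window + 1, window);
    # slicing with [::step] keeps every `step`-th window without copying,
    # so `function` is only ever called on the windows we care about.
    windows = sliding_window_view(values, window)[::step]
    results = [function(w) for w in windows]
    # Window k covers positions k*step .. k*step + window - 1
    end_labels = series.index[window - 1::step]
    return pd.Series(results, index=end_labels, name=series.name)


prices = pd.Series(
    [63, 92, 92, 5, 90, 3, 81, 98, 100, 58, 38, 15, 75, 19], name="Price"
)
out = rolling_with_step(prices, np.mean, window=4, step=2)
print(out)
```

With window=4 and step=2 this yields values at rows 3, 5, 7, ..., 13, matching the non-NaN entries of the New column in the first answer. The sketch assumes the series has at least `window` rows; `sliding_window_view` raises a ValueError otherwise.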