How to predict time series in scikit-learn?


According to Wikipedia, EWMA works well with stationary data, but it does not behave as expected in the presence of trends or seasonality. In those cases you should use a second- or third-order EWMA method, respectively. I decided to look at the pandas ewma function to see how it handles trends, and this is what I came up with:

import numpy as np
import pandas
import matplotlib.pyplot as plt
# pandas.stats.moments.ewma was removed from later pandas releases;
# Series.ewm(...).mean() is its replacement
def ewma(x, span):
    return pandas.Series(x).ewm(span=span).mean().values
# make a hat function, and add noise
x = np.linspace(0, 1, 100)
x = np.hstack((x, x[::-1]))
x += np.random.normal(loc=0, scale=0.1, size=200)
plt.plot(x, alpha=0.4, label='Raw')
# take EWMA in both directions with a smaller span term
fwd = ewma(x, span=15)             # take EWMA in fwd direction
bwd = ewma(x[::-1], span=15)       # take EWMA in bwd direction
c = np.vstack((fwd, bwd[::-1]))    # lump fwd and bwd together
c = np.mean(c, axis=0)             # average
# regular EWMA, with bias against trend
plt.plot(ewma(x, span=20), 'b', label='EWMA, span=20')
# "corrected" (?) EWMA
plt.plot(c, 'r', label='Reversed-Recombined')
plt.legend(loc=8)
plt.savefig('ewma_correction.png', format='png', dpi=100)

[Figure: the raw noisy hat function, the plain EWMA (span=20), and the reversed-recombined EWMA]

As you can see, the plain EWMA lags behind the trend on both the uphill and the downhill side. We can correct for this (without having to implement a second-order scheme ourselves) by taking the EWMA in both directions and then averaging. I hope your data was stationary!
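A quick numerical check of that claim, as a minimal sketch: it assumes a pandas version where `Series.ewm(...).mean()` is available, uses a fixed seed, and compares both smoothers against the noise-free hat. The helper names `two_sided_ewma` and `rmse` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd


def ewma(values, span):
    """EWMA of a 1-D array via Series.ewm (assumed available in your pandas)."""
    return pd.Series(values).ewm(span=span).mean().values


def two_sided_ewma(values, span):
    """Average of the forward EWMA and the time-reversed backward EWMA."""
    fwd = ewma(values, span)
    bwd = ewma(values[::-1], span)[::-1]
    return (fwd + bwd) / 2.0


rng = np.random.RandomState(0)            # fixed seed so the run is repeatable
ramp = np.linspace(0, 1, 100)
truth = np.hstack((ramp, ramp[::-1]))     # noise-free hat function
noisy = truth + rng.normal(loc=0, scale=0.1, size=truth.size)


def rmse(estimate):
    """Root-mean-square error against the noise-free hat."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


rmse_plain = rmse(ewma(noisy, span=20))
rmse_two_sided = rmse(two_sided_ewma(noisy, span=15))
print("plain EWMA RMSE:    ", rmse_plain)
print("two-sided EWMA RMSE:", rmse_two_sided)
```

Keep in mind that the backward pass relies on future samples, so the two-sided version suits smoothing data you already have rather than real-time forecasting.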


This might be what you're looking for, with regard to the exponentially weighted moving average:

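import numpy
import pandas
# pandas.stats.moments.ewma was removed from later pandas releases;
# Series.ewm(...).mean() is its replacement
EMOV_n = pandas.Series(ys).ewm(com=2).mean().values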

Here, com is the center-of-mass parameter, which sets the smoothing factor to alpha = 1 / (1 + com); the pandas documentation describes it in more detail. You can then append EMOV_n to Xs, using something like:

Xs = numpy.vstack((Xs,EMOV_n))
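To make the `com` parameter used above more concrete: pandas derives the smoothing factor as alpha = 1 / (1 + com) when `com` is given and alpha = 2 / (span + 1) when `span` is given. The sketch below checks that the parameterizations agree and that `adjust=False` reproduces the textbook recursion. It assumes the `Series.ewm` interface, and the small `ys` series is made-up data, not the data from the question.

```python
import numpy as np
import pandas as pd

ys = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])  # illustrative data only

com = 2
alpha = 1.0 / (1.0 + com)     # com = 2 corresponds to alpha = 1/3
span = 2.0 / alpha - 1.0      # from alpha = 2 / (span + 1), so span = 5

by_com = ys.ewm(com=com).mean()
by_alpha = ys.ewm(alpha=alpha).mean()
by_span = ys.ewm(span=span).mean()
print(np.allclose(by_com, by_alpha), np.allclose(by_com, by_span))

# With adjust=False the result follows the recursion
#   y[0] = x[0],  y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
manual = [ys.iloc[0]]
for x in ys.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)
recursive = ys.ewm(com=com, adjust=False).mean()
print(np.allclose(manual, recursive))
```

A larger `com` means a smaller alpha, so older observations keep more weight and the output is smoother but slower to react.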

You can then try one of scikit-learn's linear models and do something like:

from sklearn import linear_model
clf = linear_model.LinearRegression()
clf.fit(Xs, ys)
print(clf.coef_)
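Putting the pieces together, here is a minimal end-to-end sketch. Several things in it are assumptions rather than part of the original answer: the goal is taken to be one-step-ahead forecasting, a synthetic series stands in for the question's `Xs` and `ys`, and the EWMA is shifted by one step so that each row only sees past values. scikit-learn estimators expect features as columns, that is, an array of shape (n_samples, n_features), which is why `column_stack` is used here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
t = np.arange(200)
# Synthetic series (assumption): linear trend plus Gaussian noise.
series = pd.Series(0.05 * t + rng.normal(scale=0.5, size=t.size))

# Features known at time t-1: the previous value and the EWMA up to t-1.
prev_value = series.shift(1).values
prev_ewma = series.ewm(com=2).mean().shift(1).values

# Drop the first row, which has no history (NaN after the shift).
X = np.column_stack((prev_value, prev_ewma))[1:]
y = series.values[1:]

# Chronological split: fit on the first 150 rows, evaluate on the rest.
split = 150
model = LinearRegression()
model.fit(X[:split], y[:split])

print(model.coef_)                          # one coefficient per feature column
print(model.score(X[split:], y[split:]))    # R^2 on the held-out tail
```

Splitting chronologically instead of shuffling keeps the evaluation honest, since the model never trains on values that come after the ones it is asked to predict.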

Best of luck!