Using Holt-Winters for forecasting in Python


I tried generating random data until I got interesting results. Here I fed in all positive numbers and got negative forecasts:

    y = [0.92, 0.78, 0.92, 0.61, 0.47, 0.4, 0.59, 0.13, 0.27, 0.31, 0.24, 0.01]
    holtwinters(y, 0.2, 0.1, 0.05, 4)
    ...
    forecast: -0.104857182966
    forecast: -0.197407475203
    forecast: -0.463988558577
    forecast: -0.258023593197

but note that the forecast fits the negative slope of the data.

This might be the orders of magnitude you were talking about:

    y = [0.1, 0.68, 0.15, 0.08, 0.94, 0.58, 0.35, 0.38, 0.7, 0.74, 0.93, 0.87]
    holtwinters(y, 0.2, 0.1, 0.05, 4)
    ...
    forecast: 1.93777559066
    forecast: 3.11109138055
    forecast: 0.910967977635
    forecast: 0.684668348397

But I'm not sure how you'd deem it wildly inaccurate or judge that it "should be" lower.
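For what it's worth, here is a quick way to check the direction of the underlying trend in the two series above. This is only a sketch: `least_squares_slope` is a helper name I made up, and it fits a plain straight line, which is not what Holt-Winters does internally.

```python
def least_squares_slope(y):
    """Slope of the ordinary least-squares line through (0, y[0]), (1, y[1]), ..."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    numerator = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    return numerator / denominator


# The two random series from the examples above.
falling = [0.92, 0.78, 0.92, 0.61, 0.47, 0.4, 0.59, 0.13, 0.27, 0.31, 0.24, 0.01]
rising = [0.1, 0.68, 0.15, 0.08, 0.94, 0.58, 0.35, 0.38, 0.7, 0.74, 0.93, 0.87]

print("first series slope:", least_squares_slope(falling))
print("second series slope:", least_squares_slope(rising))
```

The first slope comes out negative and the second positive, so a forecast that keeps following the trend will leave the range of the observed data sooner or later.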


Whenever you're extrapolating data, you're going to have somewhat surprising results. Are you concerned more that the implementation might be incorrect or that the output doesn't have good properties for your specific usage?


First of all, if you're unsure about your specific implementation of the algorithm, I recommend that you create some test cases for it. Take another implementation (MATLAB or anything else that you know works), generate some inputs, and feed them to both the reference and your implementation. The outputs should be identical, up to floating-point rounding. I have translated and verified some algorithms from MATLAB that way; scipy.io.loadmat is great for loading the reference results.
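A minimal sketch of that kind of comparison test, in pure Python. Everything here is a stand-in: `ses_loop` plays the role of your implementation, `ses_reference` plays the role of the trusted one (in practice its outputs might come from a .mat file), and both do simple exponential smoothing rather than full Holt-Winters to keep the example short.

```python
import math
import random


def ses_loop(y, alpha):
    """Implementation under test: simple exponential smoothing, recursive form."""
    level = y[0]
    out = [level]
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
        out.append(level)
    return out


def ses_reference(y, alpha):
    """Stand-in reference: the same smoother written as an explicit weighted sum."""
    out = []
    for t in range(len(y)):
        total = (1 - alpha) ** t * y[0]
        for k in range(1, t + 1):
            total += alpha * (1 - alpha) ** (t - k) * y[k]
        out.append(total)
    return out


def assert_same_output(candidate, reference, trials=50, seed=0):
    """Feed identical random inputs to both functions and compare the results."""
    rng = random.Random(seed)  # fixed seed so a failure is reproducible
    for _ in range(trials):
        y = [rng.random() for _ in range(rng.randint(2, 30))]
        alpha = rng.uniform(0.05, 0.95)
        got, want = candidate(y, alpha), reference(y, alpha)
        assert len(got) == len(want)
        for g, w in zip(got, want):
            # Compare with a tolerance; bitwise equality is too strict for floats.
            assert math.isclose(g, w, rel_tol=1e-9, abs_tol=1e-12), (y, alpha)
    return trials


print(assert_same_output(ses_loop, ses_reference), "random cases matched")
```

The tolerance is a judgment call. Two correct implementations can differ in the last few digits just because they add things up in a different order.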

About your usage of the algorithm: you're talking about periodicities of days and weeks, yet you feed in data on a minute timescale. I don't know whether this specific algorithm handles that well, but in any case I'd suggest trying some low-pass filtering and then feeding the data in hourly, or at an even coarser resolution. Nearly 700 timesteps per period could simply be too many for the periodicity to be recognized. The data you feed in should also contain at least two complete periods of your time series. If your algorithm supports periodicity, you also have to provide the data in an appropriate way, so that it can actually see the periodicity. The fact that you get these extreme values could be a hint that the algorithm only has data for a steady trend in one direction.
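Here is a rough sketch of the downsampling idea, assuming the data is a plain list with one value per minute. Averaging each block of 60 values is a very crude low-pass filter, and a proper filter would be better for real data. The function names and the synthetic sine input are made up for illustration.

```python
import math


def downsample_mean(values, factor):
    """Average consecutive blocks of `factor` samples; an incomplete tail is dropped."""
    n_blocks = len(values) // factor
    return [
        sum(values[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


def has_enough_periods(values, period, minimum=2):
    """True if the series covers at least `minimum` complete periods."""
    return len(values) >= minimum * period


# Three days of synthetic per-minute data with a daily cycle (1440 minutes).
minutes = [math.sin(2 * math.pi * m / 1440) for m in range(3 * 1440)]

hourly = downsample_mean(minutes, 60)   # 60 minutes per block
print(len(hourly), "hourly points")
print("enough data for a 24-hour period:", has_enough_periods(hourly, 24))
```

After downsampling, a daily cycle is 24 steps long instead of 1440, which is a far more manageable season length to hand to the algorithm.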

Maybe you also want to split your forecasts, with one optimized for weekly prediction and another for intraday prediction, and then combine them again at the end.


I think the problem with this method is how they calculate the initial values. They seem to be using a linear model, about which [1] says:

This is a very poor method that should not be used as the trend will be biased by the seasonal pattern. Imagine a seasonal pattern, for example, where the last period of the year is always the largest value for the year. Then the trend will be biased upwards. Unfortunately, Bowerman, O’Connell and Koehler (2005) are not alone in recommending bad methods. I’ve seen similar, and worse, procedures recommended in other books. [1]

A better method is to decompose the time series into trend and seasonality [1].
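A minimal sketch of what decomposition-based initialization could look like for the additive case. The function name `initial_components` and the exact steps are my own assumptions, not a transcription of [1]: smooth with a centered moving average, average the leftovers per position in the cycle, then fit a line to the seasonally adjusted data.

```python
def initial_components(y, period):
    """Estimate (level, trend, seasonal) start values for additive Holt-Winters.

    Illustrative sketch only; needs at least two complete periods of data.
    """
    if len(y) < 2 * period:
        raise ValueError("need at least two complete periods")
    half = period // 2

    # 1. Smooth out the seasonality with a centered moving average.
    detrended = [[] for _ in range(period)]
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # 2 x period average: the two end points get half weight.
            total -= 0.5 * (window[0] + window[-1])
        smooth = total / period
        # 2. What remains after removing the smooth part is seasonal plus noise.
        detrended[t % period].append(y[t] - smooth)

    # 3. Average per position in the cycle, then center so the indices sum to zero.
    seasonal = [sum(vals) / len(vals) for vals in detrended]
    offset = sum(seasonal) / period
    seasonal = [s - offset for s in seasonal]

    # 4. Fit a straight line to the seasonally adjusted series.
    adjusted = [v - seasonal[t % period] for t, v in enumerate(y)]
    n = len(adjusted)
    t_mean = (n - 1) / 2
    a_mean = sum(adjusted) / n
    trend = (sum((t - t_mean) * (a - a_mean) for t, a in enumerate(adjusted))
             / sum((t - t_mean) ** 2 for t in range(n)))
    level = a_mean - trend * t_mean  # intercept of the fitted line at t = 0
    return level, trend, seasonal


# Synthetic check: a linear trend plus a known seasonal pattern of length 4.
pattern = [1.0, -2.0, 3.0, -2.0]
series = [10 + 0.5 * t + pattern[t % 4] for t in range(12)]
level, trend, seasonal = initial_components(series, 4)
print(level, trend, seasonal)
```

Because the seasonal part is removed before the line is fitted, a large value at the end of each cycle no longer drags the trend estimate upwards, which is the bias the quote complains about.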

[1] http://robjhyndman.com/hyndsight/hw-initialization/