Using Holt-Winters for forecasting in Python

python statistics forecasting

I tried generating random data until I got interesting results. Here I fed in all positive numbers and got negative forecasts:

    y = [0.92, 0.78, 0.92, 0.61, 0.47, 0.4, 0.59, 0.13, 0.27, 0.31, 0.24, 0.01]
    holtwinters(y, 0.2, 0.1, 0.05, 4)
    ...
    forecast: -0.104857182966
    forecast: -0.197407475203
    forecast: -0.463988558577
    forecast: -0.258023593197

Note, though, that the forecast follows the downward slope of the data.
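To see that the downward slope alone is enough to push an extrapolation below zero, here is a minimal sketch that swaps Holt-Winters for a plain least-squares line fitted to the same data. It is not the `holtwinters` function used above, and the helper name `linear_extrapolate` is made up for illustration.

```python
def linear_extrapolate(y, steps):
    """Fit y = a + b*t by ordinary least squares and extend it `steps` ahead."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Forecasts for t = n, n+1, ..., n+steps-1
    return slope, [intercept + slope * t for t in range(n, n + steps)]


y = [0.92, 0.78, 0.92, 0.61, 0.47, 0.4, 0.59, 0.13, 0.27, 0.31, 0.24, 0.01]
slope, forecasts = linear_extrapolate(y, 4)
print("slope:", slope)
for f in forecasts:
    print("forecast:", f)
```

Even this straight line crosses zero right after the last observation, so negative forecasts from an additive method with a trend term are not, by themselves, evidence of a bug.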

This might be the orders of magnitude you were talking about:

    y = [0.1, 0.68, 0.15, 0.08, 0.94, 0.58, 0.35, 0.38, 0.7, 0.74, 0.93, 0.87]
    holtwinters(y, 0.2, 0.1, 0.05, 4)
    ...
    forecast: 1.93777559066
    forecast: 3.11109138055
    forecast: 0.910967977635
    forecast: 0.684668348397

But I'm not sure how you'd deem it wildly inaccurate or judge that it "should be" lower.

Whenever you're extrapolating data, you're going to have somewhat surprising results. Are you concerned more that the implementation might be incorrect or that the output doesn't have good properties for your specific usage?

First of all, if you're unsure about your specific implementation of the algorithm, I recommend that you create some test cases for it. Take another implementation (MATLAB, for instance, or anything else that you know works). Generate some inputs, feed them to both the reference and your implementation, and the outputs should be identical. I have translated and verified some algorithms from MATLAB that way. scipy.io.loadmat is great for that.
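A minimal sketch of such a comparison harness follows. Simple exponential smoothing stands in for Holt-Winters so the example stays short, and both functions are hypothetical stand-ins: in practice the reference outputs would come from the trusted implementation (for example, loaded from a .mat file) rather than from a second Python function.

```python
import random


def ses_candidate(y, alpha):
    """Implementation under test: recursive simple exponential smoothing."""
    level = y[0]
    out = [level]
    for v in y[1:]:
        level = alpha * v + (1 - alpha) * level
        out.append(level)
    return out


def ses_reference(y, alpha):
    """Stand-in reference: the same smoother written as an explicit weighted sum."""
    out = []
    for t in range(len(y)):
        total = (1 - alpha) ** t * y[0]
        total += sum(alpha * (1 - alpha) ** (t - k) * y[k] for k in range(1, t + 1))
        out.append(total)
    return out


def max_abs_diff(candidate, reference, n_cases=50, length=24, seed=0):
    """Feed identical random inputs to both and return the worst disagreement."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_cases):
        y = [rng.random() for _ in range(length)]
        alpha = rng.uniform(0.05, 0.95)
        a = candidate(y, alpha)
        b = reference(y, alpha)
        worst = max(worst, max(abs(p - q) for p, q in zip(a, b)))
    return worst


worst = max_abs_diff(ses_candidate, ses_reference)
print("largest absolute difference:", worst)
```

Comparing against a small tolerance rather than demanding exact equality is usually the safer choice, since two correct implementations can differ in the last few floating-point digits.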

About your usage of the algorithm: you're talking about periodicities of days and weeks, but you feed in data on a timescale of minutes. I don't know whether this specific algorithm handles that well, but in any case I'd suggest low-pass filtering the data and then feeding it into the algorithm hourly, or at an even coarser interval. Nearly 700 timesteps for one period could simply be too many for the periodicity to be recognized. The data you feed in should also contain at least two complete periods of your time series. If your algorithm supports periodicity, you also have to provide the data in an appropriate way, so that it can actually see the periodicity. The fact that you get these extreme values could be a hint that the algorithm only has data for a steady trend in one direction.
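As a rough illustration of the filtering-and-downsampling idea, the sketch below averages per-minute samples into hourly means. A block mean is only a crude low-pass filter, and the synthetic daily sine wave and the helper name `block_mean` are assumptions made for the example.

```python
import math


def block_mean(samples, block=60):
    """Average consecutive blocks of `block` samples; a trailing partial block is dropped."""
    n_blocks = len(samples) // block
    return [
        sum(samples[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


# Synthetic per-minute data: three days with one cycle per day (1440 minutes).
minutes = [10 + 5 * math.sin(2 * math.pi * m / 1440) for m in range(3 * 1440)]
hourly = block_mean(minutes, block=60)

# The daily season length shrinks from 1440 steps to 24 steps.
print(len(minutes), "->", len(hourly))
```

With hourly data the seasonal period passed to the forecaster would be 24 for a daily cycle, and three days of input still cover more than two complete periods.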

You might also want to split your forecasts: one optimized for weekly prediction and another for intraday, which you then combine again at the end.

I think the problem with this method is how the initial values are calculated. They seem to be using a linear model, about which [1] says:

This is a very poor method that should not be used as the trend will be biased by the seasonal pattern. Imagine a seasonal pattern, for example, where the last period of the year is always the largest value for the year. Then the trend will be biased upwards. Unfortunately, Bowerman, O’Connell and Koehler (2005) are not alone in recommending bad methods. I’ve seen similar, and worse, procedures recommended in other books. [1]

A better method is to decompose the time series into trend and seasonality [1].
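One possible reading of that suggestion, sketched under the assumption of additive seasonality: estimate the trend with a centered moving average, average the detrended values per position in the cycle to get seasonal indices, then fit a line to the seasonally adjusted data for the initial level and slope. The function name `initial_values` and the exact steps are illustrative and not taken from any particular library.

```python
def initial_values(y, m):
    """Additive initial level, slope, and m seasonal indices via decomposition."""
    n = len(y)
    if n < 2 * m:
        raise ValueError("need at least two complete periods of data")
    half = m // 2

    # Step 1: centered moving average as a trend estimate (2 x m MA when m is even).
    trend = {}
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        if m % 2 == 0:
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / m
        else:
            trend[t] = sum(window) / m

    # Step 2: seasonal index = mean detrended value at each position in the cycle.
    buckets = [[] for _ in range(m)]
    for t, tr in trend.items():
        buckets[t % m].append(y[t] - tr)
    seasonal = [sum(b) / len(b) for b in buckets]
    mean_s = sum(seasonal) / m
    seasonal = [s - mean_s for s in seasonal]  # force indices to sum to zero

    # Step 3: least-squares line through the seasonally adjusted series.
    adjusted = [v - seasonal[t % m] for t, v in enumerate(y)]
    t_mean = (n - 1) / 2
    a_mean = sum(adjusted) / n
    sxy = sum((t - t_mean) * (a - a_mean) for t, a in enumerate(adjusted))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    level = a_mean - slope * t_mean  # fitted value at t = 0
    return level, slope, seasonal


# Synthetic check: linear trend 2 + 0.5*t plus a zero-mean pattern of period 4.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [2 + 0.5 * t + pattern[t % 4] for t in range(12)]
level, slope, seasonal = initial_values(series, 4)
print(level, slope, seasonal)
```

Because the seasonal component is removed before the line is fitted, a season that always peaks at the end of the cycle no longer tilts the initial slope, which is the bias the quoted passage warns about.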

[1] http://robjhyndman.com/hyndsight/hw-initialization/
