
Time series forecasting, dealing with known big orders


Your outliers appear to be seasonal variation, with the largest orders arriving in the fourth quarter. Many of the forecasting models you mentioned can include seasonal adjustments. As an example, the simplest model could have a linear dependence on year with a correction for each quarter. Code would look like:

    library(reshape2)
    library(ggplot2)

    df <- data.frame(
      period = c("08Q1", "08Q2", "08Q3", "08Q4", "09Q1", "09Q2", "09Q3", "09Q4",
                 "10Q1", "10Q2", "10Q3", "10Q4", "11Q1", "11Q2", "11Q3", "11Q4",
                 "12Q1", "12Q2", "12Q3", "12Q4", "13Q1", "13Q2", "13Q3", "13Q4",
                 "14Q1", "14Q2", "14Q3", "14Q4", "15Q1"),
      order = c(155782698, 159463653.4, 172741125.6, 204547180,
                126049319.8, 138648461.5, 135678842.1, 242568446.1,
                177019289.3, 200397120.6, 182516217.1, 306143365.6,
                222890269.2, 239062450.2, 229124263.2, 370575384.7,
                257757410.5, 256125841.6, 231879306.6, 419580274,
                268211059, 276378232.1, 261739468.7, 429127062.8,
                254776725.6, 329429882.8, 264012891.6, 496745973.9,
                42748656.73)
    )

    # Split each period label into a numeric year and a quarter label
    seasonal <- data.frame(year = as.numeric(substr(df$period, 1, 2)),
                           qtr  = substr(df$period, 3, 4),
                           data = df$order)

    # Linear trend in year plus an additive effect for each quarter
    ord_model <- lm(data ~ year + qtr, data = seasonal)
    seasonal <- cbind(seasonal, fitted = fitted(ord_model))

    # Reshape to long format so observed and fitted values share one plot
    plot_fit <- melt(seasonal, id.vars = c("year", "qtr"),
                     variable.name = "Source", value.name = "Order")
    ggplot(plot_fit, aes(x = year, y = Order, colour = qtr, shape = Source)) +
      geom_point(size = 3)

which gives the results shown in the chart below:

(Figure: Linear fit with seasonal adjustments)

Models with a seasonal adjustment but nonlinear dependence upon year may give better fits.
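As a rough illustration of that last point, the sketch below compares the linear-in-year fit with one that adds a quadratic term in year. The quadratic form, the `orders` vector name, and the model names `fit_linear` and `fit_quadratic` are my own choices for illustration; other nonlinear forms would be just as reasonable to try.

```r
# Quarterly order totals, 2008Q1 to 2015Q1, copied from the data above
orders <- c(155782698, 159463653.4, 172741125.6, 204547180,
            126049319.8, 138648461.5, 135678842.1, 242568446.1,
            177019289.3, 200397120.6, 182516217.1, 306143365.6,
            222890269.2, 239062450.2, 229124263.2, 370575384.7,
            257757410.5, 256125841.6, 231879306.6, 419580274,
            268211059, 276378232.1, 261739468.7, 429127062.8,
            254776725.6, 329429882.8, 264012891.6, 496745973.9,
            42748656.73)
n <- length(orders)

# Rebuild the year / quarter columns directly from the observation index
seasonal <- data.frame(
  year = 8 + (seq_len(n) - 1) %/% 4,
  qtr  = factor(paste0("Q", (seq_len(n) - 1) %% 4 + 1)),
  data = orders
)

# Same seasonal terms in both models; only the trend in year differs
fit_linear    <- lm(data ~ year + qtr, data = seasonal)
fit_quadratic <- lm(data ~ poly(year, 2) + qtr, data = seasonal)

# Lower AIC and a small p-value in the F-test would favour the quadratic trend
print(AIC(fit_linear, fit_quadratic))
print(anova(fit_linear, fit_quadratic))
```

Both models share the same response, so AIC and the nested F-test are comparable; that would not hold if one of them modelled log(data) instead.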


You already said you tried different ARIMA models, but as mentioned by WaltS, your series doesn't seem to contain big outliers. It has a seasonal component, which is nicely captured by auto.arima() in the forecast package:

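    library(forecast)

    # `data` is your data frame; its second column is taken to hold the order values
    myTs <- ts(as.numeric(data[, 2]), start = c(2008, 1), frequency = 4)
    myArima <- auto.arima(myTs, lambda = 0)
    myForecast <- forecast(myArima)
    plot(myForecast)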


where the lambda=0 argument to auto.arima() applies a Box-Cox transformation with λ = 0, which is equivalent to taking the log of the data, to account for the increasing amplitude of the seasonal component.
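To see why λ = 0 corresponds to a log transform, and why that helps with a seasonal swing that grows with the level, here is a small base-R sketch. The `box_cox` helper is a hypothetical function written for this illustration (it is not the forecast package's implementation), and the numbers are made up rather than taken from the order data.

```r
# Hypothetical helper implementing the standard Box-Cox formula
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

x <- c(126, 243, 306, 497)   # illustrative values, not the order data

# As lambda shrinks towards 0, the power form approaches log(x)
print(box_cox(x, 0))
print(box_cox(x, 1e-6))

# A seasonal peak that is 1.5 times the level: the gap grows on the raw
# scale but is constant on the log scale
level <- c(100, 200, 400)
q4    <- 1.5 * level
print(q4 - level)            # 50 100 200
print(log(q4) - log(level))  # log(1.5) each time
```

A constant seasonal gap on the transformed scale is what an additive seasonal ARIMA term can model directly; forecasts are then back-transformed to the original scale.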


The approach you are trying to use to cleanse your data of outliers is not going to be robust enough to identify them. I should add that there is a free outlier package in R called tsoutliers, but it won't do the things I am about to show you.

You have an interesting time series here. The trend changes over time, with the upward trend weakening a bit. If you bring in two time trend variables, the first beginning at period 1 and the second beginning at period 14, you will capture this change. As for seasonality, you can capture the high 4th quarter with a dummy variable. The model is parsimonious: the other 3 quarters are not different from the average, so there is no need for an AR12, seasonal differencing, or 3 seasonal dummies. You can also capture the impact of the last two observations being outliers with two dummy variables. Ignore the 49 above the word trend, as that is just the name of the series being modeled.

(Figure: Actual, Fit, Forecasts with Confidence limits)
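For readers who want to try something similar in R, the regression described above can be approximated with lm(). This is only a sketch of my reading of that description, not the output of the software that produced the chart: the column names (`trend1`, `trend2`, `q4`, `out28`, `out29`) are invented here, and coding the second trend as 1, 2, 3, ... from period 14 onward is an assumption.

```r
# Quarterly order totals, 2008Q1 to 2015Q1, copied from the question's data
orders <- c(155782698, 159463653.4, 172741125.6, 204547180,
            126049319.8, 138648461.5, 135678842.1, 242568446.1,
            177019289.3, 200397120.6, 182516217.1, 306143365.6,
            222890269.2, 239062450.2, 229124263.2, 370575384.7,
            257757410.5, 256125841.6, 231879306.6, 419580274,
            268211059, 276378232.1, 261739468.7, 429127062.8,
            254776725.6, 329429882.8, 264012891.6, 496745973.9,
            42748656.73)
n <- length(orders)
t <- seq_len(n)

model_data <- data.frame(
  orders = orders,
  trend1 = t,                        # first trend, starting at period 1
  trend2 = pmax(0, t - 13),          # second trend, 0 until it starts at period 14
  q4     = as.integer(t %% 4 == 0),  # dummy for fourth quarters
  out28  = as.integer(t == n - 1),   # pulse dummy for the second-to-last point
  out29  = as.integer(t == n)        # pulse dummy for the last point
)

fit <- lm(orders ~ trend1 + trend2 + q4 + out28 + out29, data = model_data)
print(summary(fit))
```

The coefficient on `trend2` measures how much the slope changes from period 14 onward, so a negative estimate would match the weakening trend described above. The two pulse dummies make the last two observations fit exactly, which keeps them from distorting the trend and seasonal estimates.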