FORECAST PRO EXAMPLE #2: AUTOSALE revisited.
First, as a benchmark, here is a battery of models fitted in the Forecasting procedure in Statgraphics. Models A and B are the simple and linear exponential smoothing models with multiplicative seasonal adjustment. A fixed-rate inflation adjustment of 0.5% per period has been added to the simple exponential smoothing model to put some long-term trend into its forecasts and reduce error heteroscedasticity. The Winters seasonal smoothing model has been specified as model C. Two ARIMA models fitted to the logged data are included as models D and E.
Notice that all the models are quite similar in their all-around performance in the estimation period, and models A-D are very similar in the validation period, with model E lagging a bit behind. Interestingly, the ARIMA models did relatively better in terms of percentage error because, being fitted in log units, they were actually optimized in terms of percentage error rather than absolute error.
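For reference, the error measures that appear in the comparison tables that follow can be sketched as below. This is a minimal illustration under stated assumptions, not Statgraphics' own code: the function name `error_stats` is hypothetical, errors are taken as actual minus forecast, and sums are divided by the number of errors (a package may divide by residual degrees of freedom instead).

```python
from math import sqrt

def error_stats(actual, forecast):
    """Common forecast-error summaries (assumed convention: error = actual - forecast)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    # Percentage errors are expressed relative to the actual value.
    pct_errors = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "MSE": mse,                                   # mean squared error
        "RMSE": sqrt(mse),                            # root mean squared error
        "MAE": sum(abs(e) for e in errors) / n,       # mean absolute error
        "MAPE": sum(abs(p) for p in pct_errors) / n,  # mean absolute percentage error
        "ME": sum(errors) / n,                        # mean error (bias)
        "MPE": sum(pct_errors) / n,                   # mean percentage error (relative bias)
    }

# Tiny made-up example, only to show the calling convention.
stats = error_stats(actual=[10.0, 20.0], forecast=[9.0, 22.0])
```

A model fitted to logged data minimizes squared errors in log units, which are approximately percentage errors, so it is effectively tuned for the MAPE and MPE columns rather than for MSE and MAE.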
Model Comparison
----------------
Data variable: AUTOSALE
Number of observations = 281
Start index = 1/70
Sampling interval = 1.0 month(s)
Length of seasonality = 12
Number of periods withheld for validation: 36
Models
------
(A) Simple exponential smoothing with alpha = 0.4021
Inflation adjustment: 0.5 percent at end of period
Seasonal adjustment: Multiplicative
(B) Brown's linear exp. smoothing with alpha = 0.1375
Seasonal adjustment: Multiplicative
(C) Winter's exp. smoothing with alpha = 0.3117, beta = 0.0001, gamma = 0.2495
(D) ARIMA(0,1,1)x(0,1,1)12
Math adjustment: Natural log
(E) ARIMA(1,0,1)x(0,1,1)12 with constant
Math adjustment: Natural log
Estimation Period
Model MSE MAE MAPE ME MPE
------------------------------------------------------------------------
(A) 1.40991 0.770647 4.70927 0.0662132 0.276835
(B) 1.41955 0.791537 5.02856 0.0248242 0.282996
(C) 1.53718 0.842734 5.12773 0.0154446 -0.454623
(D) 1.58972 0.823505 4.77485 -0.0302983 -0.279636
(E) 1.56714 0.819359 4.71859 0.0307884 0.143125
Model RMSE RUNS RUNM AUTO MEAN VAR
-----------------------------------------------
(A) 1.1874 OK OK *** OK ***
(B) 1.19145 OK *** *** OK ***
(C) 1.23983 OK * OK OK ***
(D) 1.26084 OK OK * OK **
(E) 1.25185 OK OK * * **
Validation Period
Model MSE MAE MAPE ME MPE
------------------------------------------------------------------------
(A) 1.09317 0.947184 2.94464 -0.12697 -0.532003
(B) 1.23223 0.951346 2.94155 0.234821 0.57216
(C) 1.35863 0.959531 2.97817 0.00353942 -0.144202
(D) 1.15307 0.912145 2.82051 -0.144336 -0.571749
(E) 2.20643 1.19816 3.76333 -1.06931 -3.37841
Here are the estimated coefficients of the ARIMA(0,1,1)x(0,1,1) model:
ARIMA Model Summary
Parameter Estimate Stnd. Error t P-value
----------------------------------------------------------------------------
MA(1) 0.50284 0.0570327 8.8167 0.000000
SMA(1) 0.896011 0.0243729 36.7626 0.000000
----------------------------------------------------------------------------
Backforecasting: yes
Estimated white noise variance = 0.0040629 with 266 degrees of freedom
Estimated white noise standard deviation = 0.0637409
Number of iterations: 7
Now let's fit the same series using the built-in expert in Forecast Pro:
Forecast Pro for Windows Standard Edition Version 2.00
Sat Oct 05 17:12:33 1996
Expert data exploration of dependent variable AUTOSALE
---------------------------------------------------------------------
Length 245 Minimum 4.485 Maximum 37.068
Mean 16.781 Standard deviation 8.770
Classical decomposition (multiplicative)
Trend-cycle: 96.53% Seasonal: 2.55% Irregular: 0.92%
Log transform recommended for Box-Jenkins.
There are no strongly significant regressors, so I will choose
a univariate method.
Exponential smoothing outperforms Box-Jenkins by 1.754 to 1.939
out-of-sample (MAD). I tried 78 forecasts up to a maximum horizon 12.
For Box-Jenkins, I used a log transform.
Out-of-sample forecast errors are used by Forecast Pro to compare models of different types during the automatic data exploration phase. The 78 forecasts are made on a rolling basis for a 12-month period, yielding 12 one-step ahead forecasts, 11 two-step-ahead forecasts, etc., for a total of 78 forecasts at various horizons within the same year. This is a generally good method to compare models, but keep in mind that it focuses only on a single year: some models may get luckier than others in that year!
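The count of 78 follows from simple arithmetic, which can be sketched as below. The function name `rolling_forecast_counts` is hypothetical, and the sketch assumes that every forecast origin inside the hold-out window is used once for each horizon that still lands inside the window.

```python
def rolling_forecast_counts(window):
    """Number of forecasts available at each horizon in a rolling evaluation.

    With `window` hold-out periods, a forecast at horizon h can be checked
    against actual data from window - h + 1 different origins.
    """
    return {h: window - h + 1 for h in range(1, window + 1)}

counts = rolling_forecast_counts(12)
total = sum(counts.values())  # 12 + 11 + ... + 1
```

The same counting with a 36-month window gives 36, 35, ..., 1 forecasts per horizon, which is the pattern in the N column of the rolling simulation tables in this example.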
Series is trended and seasonal. Recommended model: Exponential smoothing
OK--let's use exponential smoothing with the "automatic" estimation option. Also, let's use 36 months for the validation period, as we did in Statgraphics.
Forecast Model for AUTOSALE
Automatic model selection
Multiplicative Winters: Linear trend, Multiplicative seasonality
Confidence limits proportional to indexes and level
Smoothing Final
Component Weight Value
--------------------------------------
Level 0.29731 32.557
Trend 0.01889 0.12847
Seasonal 0.22900 1.1080
Seasonal Indexes
----------------------------------------------------------
January - March 0.87334 0.89532 1.06630
April - June 1.05658 1.10797 1.11531
July - September 1.06250 1.07803 1.01079
October - December 0.96825 0.90960 0.90090
Standard Diagnostics
-------------------------------------------------------------
Sample size 245 Number of parameters 3
Mean 16.78 Standard deviation 8.788
R-square 0.9816 Adjusted R-square 0.9814
Durbin-Watson 1.91 Ljung-Box(18)=20.49 P=0.6942
Forecast error 1.198 BIC 1.232 (Best so far)
MAPE 0.04891 RMSE 1.191
MAD 0.7953
The Ljung-Box statistic is a test of the significance of the (weighted) sum of squares of the first n residual autocorrelations--i.e., a test for the "total autocorrelation" in the residuals--where n=18 was used here. A "good" value of this statistic is a number not much larger than n, and the P-value as reported here ought to be less than 0.95. (Forecast Pro's P runs in the opposite direction from the conventional p-value: large values, not small ones, signal a problem.) Here the statistic is 20.49 with P=0.6942, so the overall amount of autocorrelation in the residuals is acceptably low.
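As a rough illustration of the test just described, here is a minimal sketch of the textbook Ljung-Box formula together with a chi-square cumulative probability. The function names are hypothetical, and treating the reported P as the chi-square cumulative probability with n degrees of freedom is an assumption: it is not documented in the output, but it does reproduce the printed values closely.

```python
from math import exp

def ljung_box_q(autocorrs, sample_size):
    """Textbook Ljung-Box Q from residual autocorrelations r_1..r_n."""
    n = sample_size
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(autocorrs, start=1))

def chi2_cdf_even_df(x, df):
    """Chi-square cumulative probability, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i!
        total += term
    return 1.0 - exp(-half) * total

# Values printed for the two automatically selected models in this example.
p_winters = chi2_cdf_even_df(20.49, 18)
p_arima = chi2_cdf_even_df(28.8, 18)
```

Under that assumption, a statistic of 20.49 corresponds to a cumulative probability near 0.69 and 28.8 to one near 0.95, in line with the P values shown in the diagnostics.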
Rolling simulation results
Cumulative Cumulative
H N MAD Average MAPE Average
---------------------------------------------------------------------
1 36 0.974 0.974 0.030 0.030
2 35 0.982 0.978 0.031 0.031
3 34 1.074 1.009 0.034 0.032
4 33 1.237 1.064 0.039 0.033
5 32 1.309 1.110 0.041 0.035
6 31 1.554 1.178 0.049 0.037
7 30 1.697 1.246 0.053 0.039
8 29 1.742 1.301 0.053 0.041
9 28 1.843 1.354 0.055 0.042
10 27 1.960 1.406 0.058 0.043
11 26 2.020 1.453 0.060 0.045
12 25 2.147 1.500 0.064 0.046
13 24 2.226 1.545 0.067 0.047
14 23 2.188 1.580 0.066 0.048
15 22 2.308 1.617 0.069 0.049
16 21 2.336 1.650 0.070 0.050
17 20 2.419 1.683 0.072 0.051
18 19 2.496 1.714 0.074 0.052
19 18 2.518 1.742 0.074 0.053
20 17 2.470 1.765 0.071 0.053
21 16 2.569 1.789 0.073 0.054
22 15 2.582 1.810 0.073 0.054
23 14 2.413 1.825 0.068 0.055
24 13 2.360 1.837 0.066 0.055
25 12 2.168 1.843 0.060 0.055
26 11 1.857 1.844 0.050 0.055
27 10 1.766 1.842 0.048 0.055
28 9 1.223 1.833 0.033 0.055
29 8 0.844 1.821 0.023 0.054
30 7 0.785 1.810 0.023 0.054
31 6 0.832 1.801 0.023 0.054
32 5 0.843 1.794 0.024 0.053
33 4 0.970 1.789 0.027 0.053
34 3 0.958 1.785 0.025 0.053
35 2 0.663 1.781 0.017 0.053
36 1 1.194 1.781 0.030 0.053
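The "Cumulative Average" columns in the table above appear to be running means weighted by N, the number of forecasts at each horizon, rather than simple averages of the rows. That reading is an inference from the printed numbers, not something stated in the output. A minimal sketch, with the hypothetical function name `cumulative_weighted_average`:

```python
def cumulative_weighted_average(rows):
    """Running N-weighted mean of per-horizon error statistics.

    `rows` is a sequence of (n_forecasts, statistic) pairs, one per horizon.
    """
    averages = []
    weighted_sum = 0.0
    total_n = 0
    for n_forecasts, statistic in rows:
        weighted_sum += n_forecasts * statistic
        total_n += n_forecasts
        averages.append(weighted_sum / total_n)
    return averages

# First four (N, MAD) rows of the Winters rolling simulation table.
winters_rows = [(36, 0.974), (35, 0.982), (34, 1.074), (33, 1.237)]
cumulative = [round(a, 3) for a in cumulative_weighted_average(winters_rows)]
```

Because short horizons contribute many more forecasts than long ones, the cumulative average is dominated by short-horizon performance and moves very little over the last dozen rows.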
Notice that the estimated coefficients are similar but not identical to those found by Statgraphics--with nonlinear estimation, it's not guaranteed that you will get exactly the same results, and the two programs may perform backforecasting or otherwise initialize the models in different ways. (Fortunately, Forecast Pro prints out the final estimates of the seasonal indices, unlike Statgraphics. Neither program reveals the starting values it uses, which is one reason the Winters model is a bit of a black box.) The first row of the "rolling simulation results"--which shows the 1-step-ahead error statistics in the validation period--agrees reasonably closely with what we obtained in Statgraphics (MAE=0.96 in SG versus MAD=0.974 here).
This model gets very lucky with its long-term forecasts near the end of the validation period: in the rolling simulation results above, the MAD drops back below 1.0 at horizons 29 through 35.

Now let's try Box-Jenkins with a log transform. (We need to return to the "tableau" to add the log transformation.) We'll start by using the "automatic" estimation option here too.
Forecast Model for AUTOSALE (Log transform)
Automatic model selection
ARIMA(0,1,1)*(2,0,3)
Term Coefficient Std. Error t-Statistic Significance
---------------------------------------------------------------------
b[1] 0.4733 0.0571 8.2876 1.0000
A[12] 0.4403 0.2779 1.5844 0.8869
A[24] 0.5578 0.2776 2.0092 0.9555
B[12] 0.1197 0.2725 0.4393 0.3396
B[24] 0.5286 0.2115 2.4995 0.9876
B[36] 0.2467 0.0693 3.5622 0.9996
Embedded insignificant AR terms -- consider dynamic regression.
Yikes--what happened here? How did we end up with SAR=2 and SMA=3 with NO seasonal difference? At first glance this model appears VERY strange, but upon closer inspection, some interesting facts emerge. Notice that the two SAR coefficients add up to 0.9981--i.e., almost exactly 1.0. This means there is a UNIT ROOT IN THE SAR PART OF THE MODEL!! Effectively the model is performing a seasonal difference after all--it's just buried in the SAR part of the model. What we ought to do (minimally) is add a seasonal difference while reducing SAR from 2 to 1--and I would try reducing the orders of seasonal terms even further. (I suspect the "expert system" in this program is trying too hard to eliminate the stubborn autocorrelation we saw around the seasonal period when we fitted the series in Statgraphics.) But let's continue for now...
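The unit-root argument can be checked numerically. The sketch below assumes the seasonal AR polynomial has the usual form 1 - A[12]*z - A[24]*z^2 with z standing for the 12-month backshift B^12 (a sign convention inferred from the argument above, not confirmed by the program output), and the function name `seasonal_ar_roots` is hypothetical.

```python
from math import sqrt

def seasonal_ar_roots(phi1, phi2):
    """Real roots in z of 1 - phi1*z - phi2*z**2 = 0 (z plays the role of B^12)."""
    # Rearranged as phi2*z**2 + phi1*z - 1 = 0 and solved with the quadratic formula.
    discriminant = phi1 * phi1 + 4.0 * phi2
    if discriminant < 0:
        raise ValueError("complex roots are not handled in this sketch")
    r = sqrt(discriminant)
    return (-phi1 + r) / (2.0 * phi2), (-phi1 - r) / (2.0 * phi2)

A12, A24 = 0.4403, 0.5578   # SAR estimates from the listing above
z_pos, z_neg = seasonal_ar_roots(A12, A24)

# Factored form: (1 - z/z_pos) * (1 - z/z_neg)
near_difference_coef = 1.0 / z_pos   # multiplies z in the first factor
leftover_sar_coef = 1.0 / z_neg      # negative, so the second factor is 1 + |coef|*z
```

With these estimates the positive root is about 1.001, so the polynomial factors approximately as (1 - 0.999 B^12)(1 + 0.558 B^12): essentially a seasonal difference times a single leftover seasonal AR factor.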
Standard Diagnostics
------------------------------------------------------------
Sample size 244 Number of parameters 6
Mean 2.682 Standard deviation 0.5428
R-square 0.9876 Adjusted R-square 0.9873
Durbin-Watson 1.894 Ljung-Box(18)=28.8 P=0.9491
Forecast error 0.06107 BIC 0.9432 (Best so far)
MAPE 0.04579 RMSE 1.165
MAD 0.7542
Rolling simulation results
Cumulative Cumulative
H N MAD Average MAPE Average
---------------------------------------------------------------------
1 36 0.893 0.893 0.028 0.028
2 35 0.890 0.892 0.028 0.028
3 34 0.922 0.902 0.030 0.028
4 33 1.119 0.954 0.036 0.030
5 32 1.080 0.977 0.035 0.031
6 31 1.276 1.023 0.041 0.033
7 30 1.389 1.071 0.044 0.034
8 29 1.442 1.112 0.045 0.035
9 28 1.531 1.153 0.046 0.036
10 27 1.644 1.195 0.049 0.037
11 26 1.728 1.236 0.051 0.038
12 25 1.907 1.281 0.057 0.040
13 24 1.961 1.323 0.059 0.041
14 23 1.957 1.359 0.060 0.042
15 22 2.109 1.397 0.064 0.043
16 21 2.221 1.435 0.067 0.044
17 20 2.387 1.475 0.072 0.045
18 19 2.483 1.513 0.075 0.046
19 18 2.477 1.547 0.073 0.047
20 17 2.497 1.578 0.073 0.048
21 16 2.676 1.610 0.077 0.049
22 15 2.601 1.636 0.074 0.050
23 14 2.412 1.655 0.068 0.050
24 13 2.515 1.674 0.071 0.051
25 12 2.370 1.688 0.066 0.051
26 11 2.311 1.699 0.064 0.051
27 10 2.380 1.710 0.068 0.051
28 9 2.214 1.717 0.064 0.052
29 8 1.991 1.721 0.059 0.052
30 7 2.131 1.725 0.065 0.052
31 6 2.267 1.730 0.066 0.052
32 5 2.578 1.737 0.073 0.052
33 4 3.080 1.745 0.084 0.052
34 3 3.029 1.751 0.078 0.052
35 2 2.699 1.754 0.068 0.053
36 1 4.035 1.757 0.101 0.053
This model actually outperforms the Winters model in its 1-step-ahead forecasts in the validation period (MAD=0.893 versus 0.974), but it doesn't get quite as lucky with its long-term forecasts. (Its Ljung-Box statistic is also a little worse.) The confidence intervals widen fairly rapidly because of the effect of the log transformation.
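To see why limits computed in log units behave this way, consider the following minimal sketch. It assumes normal-theory limits in log units that are simply exponentiated back to original units, and it treats the printed "Forecast error" of 0.06107 as a 1-step standard error in log units. Both are assumptions about what the program does, and the function name `back_transformed_limits` is hypothetical.

```python
from math import exp

def back_transformed_limits(log_forecast, log_std_error, z=1.96):
    """Exponentiate symmetric limits in log units back to original units."""
    lower = exp(log_forecast - z * log_std_error)
    point = exp(log_forecast)
    upper = exp(log_forecast + z * log_std_error)
    return lower, point, upper

LOG_STD_ERROR = 0.06107   # "Forecast error" from the diagnostics (assumed 1-step, log units)
# 2.682 is the mean of the logged series, used here only as an illustrative level.
low, mid, high = back_transformed_limits(2.682, LOG_STD_ERROR)
```

The limits come out as fixed multiples of the point forecast (roughly 0.89 and 1.13 times it at one step ahead), so they are asymmetric and scale with the level of the series. As the standard error in log units grows with the horizon, the upper limit in original units grows exponentially.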
Now here's one of the more "standard" seasonal ARIMA models that we identified earlier in Statgraphics:
Forecast Model for AUTOSALE (Log transform)
ARIMA(0,1,1)*(0,1,1)
Term Coefficient Std. Error t-Statistic Significance
---------------------------------------------------------------------
b[1] 0.4981 0.0575 8.6632 1.0000
B[12] 0.8896 0.0287 30.9505 1.0000
Standard Diagnostics
-------------------------------------------------------------
Sample size 232 Number of parameters 2
Mean 2.733 Standard deviation 0.5058
R-square 0.985 Adjusted R-square 0.9849
Durbin-Watson 1.896 Ljung-Box(18)=24.39 P=0.8575
Forecast error 0.06218 BIC 0.9751
MAPE 0.04774 RMSE 1.252
MAD 0.8215
Rolling simulation results
Cumulative Cumulative
H N MAD Average MAPE Average
---------------------------------------------------------------------
1 36 0.907 0.907 0.028 0.028
2 35 0.947 0.927 0.030 0.029
3 34 0.974 0.942 0.031 0.030
4 33 1.207 1.005 0.039 0.032
5 32 1.209 1.044 0.039 0.033
6 31 1.379 1.095 0.044 0.035
7 30 1.544 1.154 0.049 0.037
8 29 1.551 1.198 0.048 0.038
9 28 1.644 1.241 0.050 0.039
10 27 1.814 1.290 0.054 0.040
11 26 1.820 1.331 0.055 0.042
12 25 1.931 1.372 0.058 0.043
13 24 1.964 1.408 0.060 0.044
14 23 1.819 1.431 0.057 0.044
15 22 2.037 1.462 0.063 0.045
16 21 2.151 1.493 0.066 0.046
17 20 2.204 1.523 0.068 0.047
18 19 2.304 1.553 0.070 0.048
19 18 2.246 1.578 0.067 0.049
20 17 2.332 1.602 0.068 0.049
21 16 2.425 1.626 0.069 0.050
22 15 2.518 1.650 0.071 0.051
23 14 2.526 1.671 0.071 0.051
24 13 2.510 1.690 0.071 0.052
25 12 2.417 1.704 0.068 0.052
26 11 2.566 1.720 0.074 0.052
27 10 2.769 1.737 0.081 0.053
28 9 2.596 1.749 0.077 0.053
29 8 2.655 1.760 0.079 0.053
30 7 3.202 1.776 0.095 0.054
31 6 3.329 1.790 0.096 0.054
32 5 3.646 1.804 0.103 0.055
33 4 4.122 1.818 0.112 0.055
34 3 4.173 1.829 0.107 0.055
35 2 4.019 1.836 0.101 0.055
36 1 5.210 1.841 0.130 0.055
Notice that its 1-period-ahead error statistics agree reasonably well with what was obtained in Statgraphics (MAD=0.907 here versus MAE=0.912 in SG). The estimated coefficients are close too (b[1]=0.498 and B[12]=0.890 here versus MA(1)=0.503 and SMA(1)=0.896 in SG). This model is a little less lucky in its long-term forecasts, but its average performance in the validation period is not much different from those of the previous models--and it is MUCH simpler!
Finally, here's the other ARIMA model we tried in Statgraphics, which substitutes an AR(1) term for the nonseasonal difference and thereby estimates the long-term trend in the series rather than the local trend:
Forecast Model for AUTOSALE (Log transform)
ARIMA(1,0,1)*(0,1,1)
Term Coefficient Std. Error t-Statistic Significance
---------------------------------------------------------------------
a[1] 0.9198 0.0326 28.2486 1.0000
b[1] 0.4221 0.0735 5.7396 1.0000
B[12] 0.8924 0.0271 32.9531 1.0000
_CONST 0.0072 0.0030 2.3964 0.9834
Standard Diagnostics
-------------------------------------------------------------
Sample size 233 Number of parameters 3
Mean 2.729 Standard deviation 0.5097
R-square 0.9857 Adjusted R-square 0.9856
Durbin-Watson 1.938 Ljung-Box(18)=24.78 P=0.8688
Forecast error 0.06123 BIC 0.9647
MAPE 0.04699 RMSE 1.237
MAD 0.8149
Rolling simulation results
Cumulative Cumulative
H N MAD Average MAPE Average
---------------------------------------------------------------------
1 36 1.207 1.207 0.038 0.038
2 35 1.683 1.442 0.053 0.045
3 34 2.160 1.674 0.068 0.053
4 33 2.694 1.918 0.085 0.060
5 32 3.143 2.149 0.098 0.067
6 31 3.595 2.372 0.112 0.074
7 30 4.033 2.587 0.125 0.081
8 29 4.374 2.787 0.134 0.087
9 28 4.699 2.973 0.143 0.092
10 27 5.027 3.149 0.151 0.097
11 26 5.257 3.309 0.158 0.102
12 25 5.519 3.460 0.166 0.106
13 24 5.931 3.612 0.178 0.111
14 23 6.148 3.754 0.185 0.115
15 22 6.443 3.890 0.194 0.119
16 21 6.637 4.016 0.200 0.123
17 20 6.861 4.136 0.206 0.126
18 19 7.101 4.250 0.212 0.129
19 18 7.310 4.357 0.216 0.132
20 17 7.485 4.457 0.217 0.135
21 16 7.771 4.554 0.223 0.138
22 15 8.058 4.648 0.229 0.140
23 14 8.090 4.732 0.230 0.142
24 13 8.194 4.808 0.233 0.144
25 12 8.296 4.878 0.236 0.146
26 11 8.402 4.942 0.241 0.148
27 10 8.633 5.001 0.249 0.150
28 9 8.555 5.052 0.246 0.151
29 8 8.759 5.098 0.252 0.152
30 7 9.081 5.142 0.262 0.153
31 6 9.190 5.179 0.260 0.154
32 5 9.571 5.212 0.266 0.155
33 4 10.061 5.242 0.271 0.156
34 3 10.389 5.265 0.265 0.156
35 2 10.192 5.280 0.256 0.157
36 1 11.312 5.289 0.283 0.157
Despite its good intentions in trying to estimate the long-term trend in a more stable manner than the other models, this model gets embarrassed in the validation period because it fails to respond to the cyclical downturn which happens to occur right at the end of the estimation period (around the beginning of '91). The other models picked this up in one way or another and therefore were luckier in their long-term forecasts for the next three years. ("The race is not always to the swift nor the battle to the strong....") This model's 1-step-ahead performance is not too bad, though.