Automatic Forecasting Software
There are a number of software packages on the market that advertise
"automatic" forecasting capabilities: you put a time series in, and the
package automatically identifies and fits the "best" model from among some
class of models. Some of them bill themselves as "expert systems" on relatively
flimsy grounds, while others more or less live up to their claims. With
regard to such software, I would offer the same warning as with stepwise
regression: properly used, and guided by experience with "manual" model-fitting,
the best of such software can put more data-analysis power at your fingertips
and speed up routine forecasting applications. Carelessly used, it merely
allows you to foul things up in a bigger way, obtaining results without
insight while getting a false sense of security that "the computer knows
best." Automatic forecasting software is a complement to, not a substitute
for, your own forecasting expertise.
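To make the idea of automatic selection concrete, here is a minimal sketch in Python of what such a package does in miniature: it fits a small class of candidate models to all but the last few observations and picks the one with the lowest error on the held-out points. The candidate set (naive, mean, and simple exponential smoothing over a grid of smoothing constants), the function names, the sample data, and the use of mean absolute error are all assumptions made for illustration, not the procedure of any particular product.

```python
from statistics import mean

def naive_forecast(train, horizon):
    """Random-walk forecast: repeat the last observed value."""
    return [train[-1]] * horizon

def mean_forecast(train, horizon):
    """Constant forecast equal to the historical mean."""
    return [mean(train)] * horizon

def ses_forecast(train, horizon, alpha):
    """Simple exponential smoothing with a fixed smoothing constant alpha."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def auto_select(series, holdout=4):
    """Score each candidate on the last `holdout` points and return the winner."""
    train, test = series[:-holdout], series[-holdout:]
    candidates = {"naive": naive_forecast, "mean": mean_forecast}
    for alpha in (0.1, 0.3, 0.5, 0.7, 0.9):
        # Bind alpha as a default argument so each lambda keeps its own value.
        candidates[f"ses(alpha={alpha})"] = (
            lambda tr, h, a=alpha: ses_forecast(tr, h, a)
        )
    scores = {name: mae(test, f(train, holdout)) for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical, steadily trending series used only to exercise the sketch.
series = [10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27]
best, scores = auto_select(series)
print(best, scores[best])
```

On this trending series the sketch picks the naive model, even though none of the candidates includes a trend. The "best" model is only the best within the class that was searched, which is why it pays to ask what that class is.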
When evaluating such software, here are a few points to keep in mind:
-
What kind of exploratory data analysis does it enable you to do
prior to model-fitting? What kind of graphical and statistical reporting
capabilities does it have in general? What sort of data transformations
(e.g., differencing, logging, deflating) does it make available?
-
What class of models does it scan through in order to come up with
a best model? In particular, is it limited to smoothing and 1- or 2-variable
regression models, or does it also look at multiple regression and ARIMA,
or even more sophisticated models? What methods does it use to handle seasonality?
Is this class of models adequate for your purposes?
-
What sort of diagnostic information (e.g., residual plots and statistics)
does it provide after fitting a model? Does it perform goodness-of-fit
tests and out-of-sample validation? Does it warn you if the modeling assumptions
are not satisfied or the model otherwise does not fit well? Does it show
you time series plots, probability plots, and ACF plots of the residuals,
so you can decide for yourself?
-
What sort of control, if any, are you allowed to exercise over the
model selection process? Can you force it to try a model of your choice?
-
What sort of audit trail does it leave to document the model-fitting
process? How much does it tell you about the criteria it used to determine
the best model? Is it open and forthright, or is it an inscrutable "black
box" using some mysterious proprietary algorithm?
-
What capability does it have for importing and exporting data in
forms that can be read by other programs (e.g., ASCII files or spreadsheet
files)? Can you easily get data into it from your corporate database,
if necessary? Does it have its own programming/command/macro language for
automating routine analyses?
-
Can it be used to build a system for forecasting large numbers of time
series? How large a data set can it handle?
-
To what extent, if any, can models be customized to take into account the
unique features of your data?
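The residual diagnostics mentioned in the list above can be checked by hand if the software does not report them. The following is a minimal sketch, assuming the residuals are available as a plain list of numbers: it computes the sample autocorrelations of the residuals and flags any lag whose autocorrelation falls outside the approximate 95% band of plus or minus 2/sqrt(n). The function names and the alternating test series are hypothetical.

```python
from math import sqrt

def acf(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    n = len(residuals)
    m = sum(residuals) / n
    dev = [r - m for r in residuals]
    denom = sum(d * d for d in dev)  # assumes the residuals are not all equal
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def flag_autocorrelation(residuals, max_lag=6):
    """Return (lag, autocorrelation) pairs outside the +/- 2/sqrt(n) band."""
    bound = 2 / sqrt(len(residuals))
    return [
        (k, r)
        for k, r in enumerate(acf(residuals, max_lag), start=1)
        if abs(r) > bound
    ]

# Hypothetical residuals that alternate in sign, i.e. strongly autocorrelated.
residuals = [1, -1] * 10
for lag, r in flag_autocorrelation(residuals, max_lag=2):
    print(f"lag {lag}: autocorrelation {r:.2f} is outside the band")
```

Flagged lags suggest that the model has left some pattern in the errors. A package that shows you the ACF plot lets you make this judgment yourself rather than rely on its summary verdict.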
Forecast Pro for Windows, developed by Robert Goodrich and Eric
Stellwagen (Business Forecast Systems Inc., Belmont, Massachusetts), appears
to me to be one of the better programs of this kind. It offers capabilities
for fitting the most commonly used models--exponential smoothing, multiple
regression, and ARIMA--and it provides decent diagnostic support while
offering model-selection advice that is usually sound.
Comparison of features between Forecast Pro and Statgraphics
Common features of both programs:
-
Can be used to automatically fit exponential smoothing models, ARIMA models,
and dynamic regression models (regression models with lags of dependent
and/or independent variables and/or forecast errors)
-
Uses sophisticated nonlinear estimation and backforecasting
-
Performs out-of-sample validation
-
Automatically performs a battery of residual diagnostic tests
Advantages and/or nice features of Forecast Pro:
-
Built-in "expert system" helps you select models and compare them
-
Automatically tests for significance of next-higher lags of all variables
and errors in regression models
-
"Rolling simulation" feature tests univariate models out-of-sample at a
number of different forecasting horizons (up to 12 months)
-
"Multi-level" option allows you to generate forecasts at different levels
of aggregation so that they "add up"
-
"Batch" mode allows automatic forecasting of many series in one step
-
Includes the capability to add event (e.g., promotion) variables to exponential
smoothing models
-
Includes specialized models for "intermittent" data
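The "rolling simulation" idea in the list above can be illustrated with a generic rolling-origin evaluation. This is a minimal sketch of the general technique, not Forecast Pro's actual implementation: the forecast origin is moved forward one period at a time, a forecast is made from the data up to that origin, and absolute errors are collected separately for each horizon. The function names, the flat (constant) forecast assumption, and the sample series are hypothetical.

```python
def rolling_simulation(series, min_train, max_horizon, forecaster):
    """Mean absolute error by horizon, re-forecasting from each origin.

    `forecaster` takes the history up to the origin and returns one number,
    which is used as a flat forecast for every horizon (an assumption that
    suits naive and simple-smoothing models).
    """
    abs_errors = {h: [] for h in range(1, max_horizon + 1)}
    for origin in range(min_train, len(series)):
        point = forecaster(series[:origin])
        for h in range(1, max_horizon + 1):
            target = origin + h - 1  # index of the value h steps ahead
            if target < len(series):
                abs_errors[h].append(abs(series[target] - point))
    return {h: sum(e) / len(e) for h, e in abs_errors.items() if e}

def naive(history):
    """Random-walk forecast: the last observed value."""
    return history[-1]

# Hypothetical linear series: a naive forecast falls behind by 1 per period.
series = list(range(1, 11))
errors_by_horizon = rolling_simulation(series, min_train=4, max_horizon=3,
                                       forecaster=naive)
for h, err in errors_by_horizon.items():
    print(f"horizon {h}: mean absolute error {err}")
```

Reporting the error separately at each horizon shows how quickly a model's accuracy degrades as you forecast further ahead, which a single one-step-ahead statistic cannot reveal.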
Disadvantages and/or caveats:
-
Designed expressly for forecasting, not general-purpose statistical data
analysis and modeling
-
Limited capabilities for manual data exploration (no scatterplots, cross-correlation
plots, horizontal residual plots, probability plots, correlation matrices,
etc.)
-
Does not have some regression options (tests for influential observations
and multicollinearity, general nonlinear regression, automatic forward
and backward stepwise, all-possible regressions, etc.)
-
Built-in "expert" does not understand your data, does not consider all
modeling possibilities (e.g., stationarizing transformations, deflation,
missing variables) and is not necessarily right. You still have
to look over its shoulder!
-
Remember that you, not the built-in expert, must take responsibility for
the final results and explain the model to the client.
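The transformations that a built-in expert may not consider (deflation, logging, differencing) are simple to apply yourself before handing a series to the software. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the price index is assumed to be aligned period-by-period with the nominal series, and the numbers are made up for illustration.

```python
from math import log

def deflate(nominal, price_index, base=100.0):
    """Convert nominal values to constant units using a price index."""
    return [x * base / p for x, p in zip(nominal, price_index)]

def log_transform(series):
    """Natural log of each value (requires strictly positive data)."""
    return [log(x) for x in series]

def difference(series, lag=1):
    """Period-to-period change; use lag=12 for a seasonal difference of monthly data."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Made-up nominal sales and price index, for illustration only.
nominal_sales = [200.0, 212.0, 226.0, 239.0]
price_index = [100.0, 103.0, 106.0, 109.0]

real_sales = deflate(nominal_sales, price_index)
stationarized = difference(log_transform(real_sales))
print(real_sales)
print(stationarized)
```

The first difference of the logged, deflated series approximates the period-to-period percentage growth in real terms, which is often closer to stationary than the raw nominal series.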
Examples of automated analysis with Forecast Pro (v. 2):
Example #1: Department store data from assignment #3
Example #2: Time series models for AUTOSALE series