Estimation of ARIMA models
Linear versus nonlinear least squares
Mean versus constant
Backforecasting
Linear versus nonlinear least squares
ARIMA models which include only
AR terms are special cases of
linear regression models, hence they can be fitted by ordinary
least squares.
- AR forecasts are a linear function of the
coefficients as
well as a linear function of past data.
- In principle, least-squares estimates of AR coefficients
can
be exactly calculated from autocorrelations in a single "iteration".
- In practice, you can fit an AR model in the Multiple
Regression
procedure--just regress DIFF(Y) (or whatever) on lags of itself.
(But you would get slightly different results from the ARIMA
procedure--see
below!)
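The regression approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `solve_linear` and `fit_ar_ols` are hypothetical, the series is simulated from an AR(2) process with made-up coefficients, and the first p observations are simply dropped, which is what a regression procedure would do but not what the ARIMA procedure does.

```python
import random

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ar_ols(y, p):
    """Regress y(t) on a constant and its first p lags.

    Returns [mu, phi_1, ..., phi_p]. The first p observations are
    dropped because their lags are not available.
    """
    rows = [[1.0] + [y[t - k] for k in range(1, p + 1)]
            for t in range(p, len(y))]
    target = [y[t] for t in range(p, len(y))]
    k = p + 1
    # Normal equations (X'X) b = X'y, solved in a single step: no search needed.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    return solve_linear(xtx, xty)

# Simulated stationarized series (illustrative values, not from the text).
random.seed(0)
mu_true, phi_true = 1.0, [0.5, -0.3]
y = [0.0, 0.0]
for _ in range(5000):
    y.append(mu_true + phi_true[0] * y[-1] + phi_true[1] * y[-2]
             + random.gauss(0, 1))

coef = fit_ar_ols(y, 2)
print("constant and AR coefficients:", coef)
```

Because the forecast is linear in the coefficients, the estimates come from solving one set of linear equations. That is the sense in which no iterative search is required.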
ARIMA models which include MA
terms are similar to regression
models, but can't be fitted by ordinary least squares:
- Forecasts are a linear function of past data, but they are
nonlinear functions of coefficients--e.g., an ARIMA(0,1,1)
model without constant is an exponentially weighted moving average:

$$\hat{Y}(t) = (1-\theta)\left[\,Y(t-1) + \theta\,Y(t-2) + \theta^{2}\,Y(t-3) + \cdots\right]$$

...in which the forecasts are a nonlinear function of the MA(1)
parameter ("theta").
- Another way to look at the problem: you can't fit MA
models
using ordinary multiple regression because there's no way to specify
ERRORS as an independent variable--the errors are not known until
the model is fitted! They need to be calculated sequentially,
period by period, given the current parameter estimates.
- MA models therefore require a nonlinear estimation algorithm,
similar to the "Solver" algorithm in Excel.
- The algorithm uses a search process that typically
requires
5 to 10 "iterations" and occasionally may not converge.
- You can adjust the tolerances for determining step sizes
and
stopping criteria for search (although default values are usually
OK).
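The points above can be illustrated with a minimal Python sketch for an ARIMA(0,1,1) model without constant. It makes several assumptions: the function names are hypothetical, the error for the first observation is taken to be zero, the data are simulated with an arbitrary "theta" of 0.6, and a crude grid search stands in for the nonlinear optimizer that real software would use.

```python
import random

def one_step_errors(Y, theta):
    """Compute forecast errors sequentially, period by period.

    Forecast: Yhat(t) = Y(t-1) - theta * e(t-1). Each error depends on the
    previous one, so the errors cannot be listed as a regressor in advance.
    """
    errors = [0.0]  # assumption: no prior error is known for the first period
    for t in range(1, len(Y)):
        forecast = Y[t - 1] - theta * errors[t - 1]
        errors.append(Y[t] - forecast)
    return errors

def sse(Y, theta):
    """Sum of squared one-step errors for a trial value of theta."""
    return sum(e * e for e in one_step_errors(Y, theta))

def fit_theta(Y):
    """Pick the theta on a grid from -0.99 to 0.99 that minimizes the SSE."""
    grid = [round(-0.99 + 0.01 * i, 2) for i in range(199)]
    return min(grid, key=lambda th: sse(Y, th))

# Simulate Y(t) = Y(t-1) + a(t) - theta * a(t-1) with illustrative values.
random.seed(1)
theta_true = 0.6
Y, prev_shock = [0.0], 0.0
for _ in range(2000):
    shock = random.gauss(0, 1)
    Y.append(Y[-1] + shock - theta_true * prev_shock)
    prev_shock = shock

theta_hat = fit_theta(Y)
print("estimated theta:", theta_hat)
```

Every trial value of theta requires a full pass through the data to rebuild the errors, which is why the fit is a search rather than a one-shot calculation. A production routine would replace the grid with a gradient-based search and a convergence test.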
"Mean" versus "constant"
The "mean" and the "constant"
in ARIMA model-fitting
results are different numbers whenever the model includes AR terms.
Suppose that you fit an ARIMA model to Y in which p is the number
of autoregressive terms. (Assume for convenience that there are
no MA terms.) Let y denote the differenced (stationarized) version
of Y--e.g., y(t) = Y(t)-Y(t-1) if one nonseasonal difference was
used. Then the AR(p) forecasting equation for y is:

$$\hat{y}(t) = \mu + \phi_1\,y(t-1) + \cdots + \phi_p\,y(t-p)$$

This is just an ordinary multiple regression model in which "mu"
is the constant term, "phi-1" is the coefficient of
the first lag of y, and so on.
Now, internally, the software converts this slope-intercept form
of the regression equation to an equivalent form in terms of deviations
from the mean. Let m denote the mean of the stationarized
series y. Then the pth-order autoregressive equation can be written
in terms of deviations from the mean as:

$$\hat{y}(t) - m = \phi_1\,\bigl(y(t-1) - m\bigr) + \cdots + \phi_p\,\bigl(y(t-p) - m\bigr)$$
By collecting all the constant terms in this equation, we see
it is equivalent to the "mu" form of the equation if:

$$\mu = m\,(1 - \phi_1 - \cdots - \phi_p)$$
The software actually estimates
"m" (along with the
other model parameters) and reports this as the MEAN in the
model-fitting
results, along with its standard error and t-statistic, etc. The
CONSTANT (i.e., "mu") is then calculated according to
the preceding formula, i.e.,
CONSTANT = MEAN*(1 - sum of AR coefficients)
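As a quick numerical check of this relationship, the following Python sketch computes the CONSTANT from a MEAN and a set of AR coefficients, then confirms that the "mu" form and the deviations-from-the-mean form give the same forecast. The function names and the numbers are purely illustrative assumptions, not output from any software.

```python
def constant_from_mean(mean, ar_coefs):
    """CONSTANT = MEAN * (1 - sum of AR coefficients)."""
    return mean * (1.0 - sum(ar_coefs))

def forecast_mu_form(mu, ar_coefs, history):
    """Slope-intercept form: mu + phi_1*y(t-1) + ... + phi_p*y(t-p)."""
    # history[-1] is y(t-1), history[-2] is y(t-2), and so on.
    return mu + sum(phi * history[-k]
                    for k, phi in enumerate(ar_coefs, start=1))

def forecast_mean_form(mean, ar_coefs, history):
    """Deviation form: m + phi_1*(y(t-1)-m) + ... + phi_p*(y(t-p)-m)."""
    return mean + sum(phi * (history[-k] - mean)
                      for k, phi in enumerate(ar_coefs, start=1))

mean = 2.0                  # hypothetical MEAN of the stationarized series
ar_coefs = [0.5, 0.25]      # hypothetical AR(2) coefficients
history = [1.0, 3.0, 2.5]   # most recent value last

mu = constant_from_mean(mean, ar_coefs)
f_mu = forecast_mu_form(mu, ar_coefs, history)
f_mean = forecast_mean_form(mean, ar_coefs, history)
print(mu, f_mu, f_mean)
```

With no AR terms (`ar_coefs = []`) the sum is zero, so the function returns the MEAN unchanged.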
If the model does not
contain any AR terms, the MEAN and
the CONSTANT are identical.
In a model with one order of
nonseasonal differencing (only),
the MEAN is the trend factor (average period-to-period change).
In a model with one order of seasonal differencing (only),
the MEAN is the annual trend factor (average year-to-year
change).
"Backforecasting"
- The basic problem: an ARIMA model (or other time series model)
predicts future values of the time series from past values--but
how should the forecasting equation be initialized to make
a forecast for the very first observation? (Actually, AR models
can be initialized by dropping the first few observations--although
this is inefficient and wastes data--but MA models require an
estimate of a prior error before they can make the first forecast.)
- Strange but true: a stationary
time series looks
the same going forward or backward in time,
therefore...
- The same model that predicts
the future of a series
can also be used to predict its past.
- The solution: to squeeze the
most information out of the available
data, the best way to initialize an ARIMA model (or any time series
forecasting model) is to use backward forecasting
("backforecasting")
to obtain estimates of data values prior to period 1.
- When you use the
backforecasting option in ARIMA estimation,
the search algorithm actually makes two passes through the data
on each iteration: first a backward pass is made to estimate prior
data values using the current parameter estimates, then the estimated
prior data values are used to initialize the forecasting equation
for a forward pass through the data.
- If you DON'T use the
backforecasting option, the forecasting
equation is initialized by assuming that prior values of the
stationarized
series were equal to the mean.
- If you DO use the
backforecasting option, then the backforecasts
that are used to initialize the model are implicit parameters
of the model, which must be estimated along with the AR and
MA coefficients. The number of additional implicit parameters
is roughly equal to the highest lag in the model--usually 2 or
3 for a nonseasonal model, and s+1 or 2s+1 for a seasonal model
with seasonality=s. (If the model includes both a seasonal difference
and a seasonal AR or MA term, it needs two seasons' worth
of prior values to start up!)
- Note that whether backforecasting is turned on or off, an AR model
is estimated in a different way than it would be estimated in
the Multiple Regression procedure (the missing prior values are not
merely ignored--they are replaced either with an estimate of the mean
or with backforecasts), hence an AR model fitted in the ARIMA
procedure will never yield exactly the same parameter estimates
as an AR model fitted in the Multiple Regression procedure.
- Conventional wisdom: turn
backforecasting OFF when you are
unsure if the current model is valid, turn it ON to get final
parameter estimates once you're reasonably sure the model is valid.
- If the model is
mis-specified, backforecasting may lead to
failures of the parameter estimates to converge and/or to unit-root
problems.
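The two-pass idea can be sketched in Python for the simplest case, an MA(1) model y(t) = e(t) - theta*e(t-1). This follows the textbook backward-then-forward recursion and is not necessarily what any particular package does. It assumes the series has already been stationarized and had its mean subtracted; the function names and data are hypothetical.

```python
def forward_errors(y, theta, prior_error=0.0):
    """Forward pass: e(t) = y(t) + theta * e(t-1), starting from a prior error."""
    errors, prev = [], prior_error
    for value in y:
        prev = value + theta * prev
        errors.append(prev)
    return errors

def backforecast_prior_value(y, theta):
    """Backward pass: run the same model in reversed time.

    Backward errors satisfy c(t) = y(t) + theta * c(t+1), starting from zero
    after the last observation. The backforecast of the value just before
    the first observation is then -theta * c(1).
    """
    backward_error = 0.0
    for value in reversed(y):
        backward_error = value + theta * backward_error
    return -theta * backward_error

def errors_with_backforecast(y, theta):
    """Use the backforecast of y(0) to initialize the forward pass."""
    y0 = backforecast_prior_value(y, theta)
    # For MA(1), values before y(0) are backforecast as zero, so e(0) = y(0).
    return forward_errors(y, theta, prior_error=y0)

y = [1.0, -0.5, 0.25]   # illustrative mean-adjusted data
theta = 0.5             # current trial parameter value

without_bf = forward_errors(y, theta)        # prior values set to the mean
with_bf = errors_with_backforecast(y, theta)
print(without_bf)
print(with_bf)
```

In an actual estimation run both passes would be repeated for every trial value of theta, since the backforecast depends on the current parameter estimate.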