Geometric random walk model


Logarithm transformations of stock price data:  Previously we saw that a random walk model with a constant term was capable of describing a series with irregular linear growth. What if the series displays irregular exponential growth? For example, here is a plot of the S&P500 stock index, monthly closing values from 1971 to 1995.

If we take the first difference of this series, we obtain a series whose variance increases as the level of the original series rises over time (i.e., a heteroscedastic series):

This suggests that random growth is taking place in percentage terms, rather than absolute terms. We can linearize the exponential growth in the original series, and also stabilize the variance of the differences, by applying a logarithm transformation. Here's a plot of LOG(SP500):

Notice that the growth pattern now appears much more linear, and the variance of the fluctuations over time is indeed more uniform. The stabilization of the variance is confirmed by a plot of the first difference of the logged series, which is DIFF(LOG(SP500)) in SGWIN:

Now recall that the first difference of the natural logarithm is approximately the percentage change in the original series (expressed as a decimal fraction), so what we have actually plotted here is, to a close approximation, the monthly percentage return on the S&P500 index (ex-dividends). The stationary and random appearance of this graph suggests that the random walk model is appropriate for the logged series, and it should include a constant term (i.e., positive growth) to account for the trend in the original series.
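The closeness of the two quantities can be checked numerically. The following Python sketch compares the simple percentage change with the first difference of the natural log. The price series is made up for illustration (it is not actual S&P500 data), and the variable names are arbitrary.

```python
import numpy as np

# Hypothetical monthly closing values, for illustration only (not real S&P500 data).
prices = np.array([100.0, 102.0, 99.5, 101.0, 104.0, 103.0])

# Simple percentage change: (Y(t) - Y(t-1)) / Y(t-1)
pct_change = prices[1:] / prices[:-1] - 1.0

# First difference of the natural log: LOG(Y(t)) - LOG(Y(t-1))
diff_log = np.diff(np.log(prices))

for p, d in zip(pct_change, diff_log):
    print(f"percent change = {p:+.4%}   diff-log = {d:+.4%}")

# Largest discrepancy between the two measures in this sample.
max_gap = np.max(np.abs(pct_change - diff_log))
print(f"largest gap = {max_gap:.6f}")
```

For changes of a few percent the two measures differ only in the fourth decimal place; the gap widens as the changes get larger, so the approximation is best for relatively short time intervals.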


Geometric random walk model:  Application of the random walk model to the logged series implies that the forecast for the next month's value of the original series will equal the previous month's value plus a constant percentage increase. To see this, note that the random walk model for LOG(Y) is given by the equation:

LOG(Yhat(t)) = LOG(Y(t-1)) + alpha

where Yhat(t) denotes the forecast of Y for period t, and the constant term (alpha) is the average monthly change in LOG(Y), which is approximately the average monthly percentage change in Y. For the S&P 500 series, alpha is equal to 0.0056, representing an average monthly increase of 0.56%. Exponentiating both sides of the preceding equation, and using the fact that EXP(x) is approximately equal to 1+x for small x, we obtain:

Yhat(t) = Y(t-1)*EXP(alpha) ≈ Y(t-1)*(1 + alpha)

This forecasting model is known as a geometric random walk model, and it is the default model commonly used for stock market data. In SGWIN, you specify this model as a random-walk-with-growth model (i.e., an ARIMA model with one nonseasonal difference and a constant term) in conjunction with a log transformation. (Click the "Natural log" button at the top of the Model Specification panel and the "ARIMA" button below, etc.) Here's a plot of the forecasts produced by this model for the S&P500 series:

The forecasts grow at a rate equal to the average monthly increase within the sample, which is 0.56%. This number shows up as the estimated constant in the Analysis Summary report for the model.
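Outside Statgraphics, the same calculation might be sketched in Python as follows. This is only an illustration under stated assumptions: the data are simulated (the growth rate 0.0056 is the value quoted in the text, while the 4% monthly volatility is an assumed figure), the function names are hypothetical, and the point forecast simply exponentiates the forecast of the logged series. It does not attempt to reproduce Statgraphics' output.

```python
import numpy as np

def fit_geometric_random_walk(y):
    """Estimate alpha (mean change in log y) and sigma (std. dev. of those changes)."""
    log_changes = np.diff(np.log(y))
    return log_changes.mean(), log_changes.std(ddof=1)

def forecast_levels(y_last, alpha, horizon):
    """Point forecasts for 1..horizon periods ahead: Y(n) * EXP(k * alpha)."""
    steps = np.arange(1, horizon + 1)
    return y_last * np.exp(alpha * steps)

# Simulated monthly series, for illustration only (not real S&P500 data).
# 0.0056 is the growth rate quoted in the text; the 0.04 volatility is an assumption.
rng = np.random.default_rng(0)
shocks = 0.0056 + 0.04 * rng.standard_normal(299)
y = 100.0 * np.exp(np.concatenate(([0.0], np.cumsum(shocks))))

alpha_hat, sigma_hat = fit_geometric_random_walk(y)
forecasts = forecast_levels(y[-1], alpha_hat, horizon=12)

print(f"estimated alpha = {alpha_hat:.4f}, estimated sigma = {sigma_hat:.4f}")
print("12-month-ahead forecast:", round(forecasts[-1], 2))

# Exact versus approximate one-month growth factor for alpha = 0.0056.
print(np.exp(0.0056), 1 + 0.0056)
```

Note that the estimated alpha is just the total change in LOG(Y) divided by the number of periods, so it depends only on the first and last observations. The forecasts grow by the constant factor EXP(alpha) each month, which is why they trace out a smooth exponential curve.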


A "random walk down Wall Street":   The fact that stock prices behave at least approximately like a (geometric) random walk is the most striking empirical fact about financial markets.  But is it or isn't it a true random walk?  If it is, then stock prices are inherently unpredictable except in terms of long-run-average risk and return.  The best you can hope to do is to correctly estimate the  average returns and volatilities of stocks, along with their correlations, and use these statistics to determine efficient portfolios that achieve a desired risk-return tradeoff.  You can't hope to beat the market by microanalyzing patterns in stock price movements--you might as well buy-and-hold an efficient portfolio.

The random walk hypothesis was first formalized by the French mathematician (and stock analyst) Louis Bachelier in 1900, and in the past century it has been exhaustively studied and debated.  The intuition for the random walk hypothesis is a variation on the economist's classical efficiency argument, which holds that a $100 bill will never be found lying on the sidewalk because someone else would have picked it up first.  If it could be predicted from publicly available information that an abnormally large positive stock return will occur tomorrow, then the price of the stock should already have gone up today, in which case the anticipated return would already have been realized, and tomorrow's return should be normal after all.  What is not quite so obvious is why the volatility of stock returns should be approximately constant, as the basic random walk model assumes.   (It is not exactly constant in practice, but it does tend to revert to an average volatility level over the long term.)  Evidently investors naturally think in terms of percentage changes when it comes to financial asset prices, so that similar informational events (earnings reports, changes in interest rates or inflation, etc.) tend to lead to similar percentage changes in stock prices at different points in time.

Nowadays, three different forms of the random walk hypothesis are commonly distinguished.  The weak form holds that future stock returns cannot be predicted from past returns (except in the long-run-average sense), which rules out so-called "technical analysis" and stock-trading strategies such as "filter rules" in which buying and selling are triggered when stock prices hit target levels determined by recent highs and lows.  The semi-strong form holds that future stock returns cannot be predicted from past returns even together with other publicly available information such as corporate statements and analyst reports, which rules out the possibility that actively managed mutual funds will outperform a broad market index (except by luck).  The strong form holds that stock returns cannot be predicted from any available information, even inside information not available to the general public, which rules out excess profits from insider trading.  The strong form is obviously too strong to be plausible:  prices are moved by information (as well as by animal spirits, etc.), and those who are able to trade on the  information first, before prices have moved very much, should expect to profit.    The general consensus seems to be that the truth lies somewhere between the semi-strong and the strong form:  technical analysis has not been validated in controlled studies (although it still has passionate defenders), and actively managed mutual funds generally do not consistently outperform the market index, although inside information does create opportunities for (illegal) excess profits.

On reflection, it is intuitively reasonable that stock prices should follow a random walk approximately, but not exactly.  Small-scale patterns in stock prices should always be emerging and then dissipating as information is received by the most alert and savvy market participants, who then trade on it until the rest of the market catches on.  However, because of market frictions (transaction costs such as bid-ask spreads, brokerage fees, taxes, etc.), these patterns will usually be too small for the typical market observer to profit from.  For most practical purposes, for most people, it is a random walk.  This is not to say the market moves up or down for purely random reasons.  Looking backward, it is usually possible to identify sound economic reasons for particular historical movements of stock prices (notwithstanding the occasional speculative bubble like the dot-com craze of the late 1990s).  But looking forward, expectations of future returns are already factored into today's prices, in such a way that future returns will be average if today's expectations are exactly met.  To the extent that future returns deviate from the average, it will be due to the occurrence of unanticipated events, which by definition are random deviations from today's expectations.

The debate between "technicians" and "random-walkers" has taken a new turn in recent years with the emergence of behavioral finance as a lively field of study.  Behavioral finance studies markets from the perspective that investors are not rational in the expected-utility-maximizing sense of classical finance theory, as laboratory experiments have convincingly shown, which suggests that psychological theories might be helpful in explaining some of the puzzles that are observed in markets.  On this view, reading stock charts might help to anticipate the patterns that other people are likely to read into stock charts.  (The idea that markets are better predicted by psychological analysis than by rational economic calculations, analogous to guessing the winner of a beauty contest, actually dates back to John Maynard Keynes.)  But on the other hand, behavioral research has also convincingly demonstrated that people tend to misperceive random sequences as non-random--the so-called "hot hand in basketball" phenomenon.  For more discussion of the random walk hypothesis and its implications for investing, see the marvelous book A Random Walk Down Wall Street by Burton Malkiel (latest edition).


More general random walk forecasting models:  For purposes of time series forecasting, three versions of the random walk model can also be distinguished.  RW model 1 (the basic geometric random walk model illustrated above and implementable in Statgraphics) assumes that stock returns in different periods are statistically independent (and therefore uncorrelated) and also identically distributed--i.e., the market has constant volatility.  The only parameters to estimate are the average period-to-period return (the constant term in the ARIMA(0,1,0) model) and the volatility (the so-called white noise standard deviation in the ARIMA(0,1,0) model, which is just ARIMA jargon for root-mean-squared error).  RW model 2 assumes that returns in different periods are statistically independent but not identically distributed--for example, the volatility might change deterministically over time or it might depend on the current price level, which would require more parameters to specify.  RW model 3 assumes that returns in different periods are uncorrelated but not otherwise independent--e.g., the volatility in one period might depend on the volatility in recent periods.  The ARCH (autoregressive conditional heteroscedasticity) and GARCH (generalized ARCH) models are of this type:  they assume that the local volatility follows an autoregressive process, which is characterized by sudden jumps in volatility with a slow reversion to an average volatility.  The autoregressive behavior of volatility is clearly evident in plots of returns over long periods, particularly high-frequency returns such as daily returns.  Robert Engle received the Nobel Prize in Economics in 2003 for developing the ARCH model, and it is commonly used by econometricians, although it is not available in Statgraphics.
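The distinction between "uncorrelated" and "independent" in RW model 3 can be made concrete with a small simulation. The Python sketch below generates returns from a GARCH(1,1) process with Gaussian shocks, which is one standard member of the ARCH/GARCH family chosen here purely for illustration. The parameter values and function names are assumptions, not estimates from any real data.

```python
import numpy as np

def simulate_garch11(n, omega, a, b, rng):
    """Simulate n returns r(t) = sigma(t) * z(t), where z(t) is i.i.d. standard
    normal and sigma(t)^2 = omega + a * r(t-1)^2 + b * sigma(t-1)^2."""
    z = rng.standard_normal(n)
    r = np.empty(n)
    var = omega / (1.0 - a - b)  # start at the long-run average variance
    for t in range(n):
        r[t] = np.sqrt(var) * z[t]
        var = omega + a * r[t] ** 2 + b * var  # variance for the next period
    return r

def lag1_autocorr(x):
    """Sample autocorrelation of x at lag 1."""
    x = x - x.mean()
    return np.dot(x[1:], x[:-1]) / np.dot(x, x)

rng = np.random.default_rng(42)
# Assumed parameters: long-run variance = omega / (1 - a - b) = 0.0016,
# i.e. an average volatility of 4% per period.
returns = simulate_garch11(20_000, omega=8e-5, a=0.10, b=0.85, rng=rng)

acf_returns = lag1_autocorr(returns)
acf_squared = lag1_autocorr(returns ** 2)
print(f"lag-1 autocorrelation of returns:         {acf_returns:+.3f}")
print(f"lag-1 autocorrelation of squared returns: {acf_squared:+.3f}")
```

In a simulation like this, the returns themselves show essentially no autocorrelation, so their direction cannot be predicted from the past, while the squared returns are positively autocorrelated, so their magnitude can. That is the sense in which returns under RW model 3 are uncorrelated but not independent.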
Subjective estimates of time-varying volatilities are also revealed by prices of derivative securities such as options:  the "implied volatility" of a stock price at a given point in time can be determined from stock options prices using an options pricing model such as the Black-Scholes model. 
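As a rough sketch of how an implied volatility is backed out, the Python code below prices a European call with the Black-Scholes formula (for a stock paying no dividends) and then inverts the formula by bisection. All input values are hypothetical, and the function names are illustrative rather than taken from any particular library.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price(spot, strike, rate, maturity, vol):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (
        vol * math.sqrt(maturity)
    )
    d2 = d1 - vol * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def implied_vol(price, spot, strike, rate, maturity, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve bs_call_price(..., vol) = price for vol by bisection.
    This works because the call price is increasing in vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, rate, maturity, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical inputs: spot 100, strike 100, 5% interest rate, 1 year to expiry.
quoted_price = bs_call_price(100.0, 100.0, 0.05, 1.0, 0.20)
recovered_vol = implied_vol(quoted_price, 100.0, 100.0, 0.05, 1.0)
print(f"call price at 20% volatility: {quoted_price:.4f}")
print(f"implied volatility recovered: {recovered_vol:.4f}")
```

In practice the option price would come from the market rather than from the formula itself, and the volatility recovered this way is only as good as the assumptions of the pricing model.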

What does the empirical data show?  Besides time-varying volatility, there is some evidence for positive autocorrelation ("momentum") in some stock price data, particularly at high frequencies (e.g., daily).  The history of daily returns on the S&P500 index and the Dow Jones Industrial Average over the last 50 years shows a slight but statistically significant positive autocorrelation at lag 1, although this pattern seems to have faded in recent decades.  (Perhaps it self-destructed?)  Other patterns have also been observed to arise in some decades and fade away in others.  The behavior of the CRSP index (a common benchmark for econometricians) is even more remarkable:  daily returns on the CRSP value-weighted index and equal-weighted index displayed lag-1 autocorrelations in the range of 30% to over 40% over the period 1962-1978, and slightly less over the longer period 1962-1994.  This is remarkable not only for the very large magnitude of the autocorrelations but also for the fact that returns on individual stocks in the sample typically displayed no autocorrelation.  One explanation of this puzzle is that there appear to be significant cross-correlations (also called "cross-auto-correlations") between different segments of the market--e.g., some stocks tend to lead or lag others, as though information diffuses across the market from one day to the next, even though individually they behave like random walks.  Thus, for example, stocks A and B could both have zero autocorrelation at lag 1, but A could have a significant lag-1 cross-correlation with stock B, and a portfolio formed of A plus B could therefore have significant autocorrelation at lag 1.  The question remains whether this pattern is (or was) significant enough to present opportunities for excess profits.  Transaction costs make it difficult to profit from trades in large numbers of stocks at once, unless the index is itself a traded asset.
If there were an asset pegged to the same CRSP index used in these studies, the autocorrelation pattern might have been expected to self-destruct.   So, there are some predictable patterns in stock prices (particularly at the aggregate level, and seen in hindsight), but the returns on individual assets that can be traded with low transaction costs are in reasonable agreement with RW model 3, if not one of the more restrictive models.  For more information (at a much higher technical level), see The Econometrics of Financial Markets by John Campbell, Andrew Lo, and A. Craig MacKinlay.
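The two-stock example of cross-autocorrelation described above can be reproduced with simulated data. In the Python sketch below, stock A reacts to common "market news" on the same day and stock B reacts one day late. This setup, the equal variances, and all names are assumptions made for illustration, not a model fitted to CRSP data.

```python
import numpy as np

def corr_at_lag1(x, y):
    """Sample correlation between x(t) and y(t-1)."""
    return np.corrcoef(x[1:], y[:-1])[0, 1]

rng = np.random.default_rng(1)
n = 100_000

# Common "market news" shocks plus stock-specific noise, all with unit variance.
news = rng.standard_normal(n + 1)
stock_a = news[1:] + rng.standard_normal(n)   # A reacts to today's news
stock_b = news[:-1] + rng.standard_normal(n)  # B reacts to yesterday's news
portfolio = stock_a + stock_b

acf_a = corr_at_lag1(stock_a, stock_a)          # own autocorrelation of A
acf_b = corr_at_lag1(stock_b, stock_b)          # own autocorrelation of B
cross_ba = corr_at_lag1(stock_b, stock_a)       # B today vs. A yesterday
acf_portfolio = corr_at_lag1(portfolio, portfolio)

print(f"lag-1 autocorrelation of A:           {acf_a:+.3f}")
print(f"lag-1 autocorrelation of B:           {acf_b:+.3f}")
print(f"correlation of B(t) with A(t-1):      {cross_ba:+.3f}")
print(f"lag-1 autocorrelation of A plus B:    {acf_portfolio:+.3f}")
```

With these assumed variances, the theoretical values are zero for each stock's own autocorrelation, 0.5 for the lagged cross-correlation, and 0.25 for the portfolio's autocorrelation, so an index can be noticeably autocorrelated even when none of its components is.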