A Forecast Comparison of Volatility Models: Does Anything Beat a GARCH(1,1)?

P. R. Hansen, Asger Lunde

2005 · 1825 citations


Source: Hansen, P. R. & Lunde, A. (2005) · Journal of Applied Econometrics 20(7), 873–889 · DOI 10.1002/jae.800


TL;DR

Compares 330 ARCH/GARCH-type models out of sample on one-day-ahead conditional-variance forecasts. For DM-$ exchange rates, nothing reliably beats the simple GARCH(1,1); for IBM equity returns, GARCH(1,1) is clearly inferior to models that allow a leverage effect (asymmetry), with the best overall performance from the A-PARCH(2,2) of Ding, Granger & Engle (1993). A landmark demonstration that complexity helps only when it adds the right feature — here, asymmetry — and that the test you use matters.


Problem it solves

The ARCH/GARCH literature spawned hundreds of model variants. The paper asks whether any of them beats a parsimonious GARCH(1,1) benchmark out of sample, using tests that control for multiple comparisons across the whole model set.


The method

  • Estimate 330 models spanning the ARCH/GARCH family on daily DM-$ exchange rate data and daily IBM returns; IBM forecasts are evaluated against a new realized-variance proxy built from intraday data.
  • Evaluate one-day-ahead forecasts using six different loss functions (e.g., MSE, QLIKE, R2LOG), substituting realized variance for the latent conditional variance.
  • Benchmark all 330 models against GARCH(1,1) using the Superior Predictive Ability (SPA) test of Hansen (2005; circulated as a working paper in 2001) and White's (2000) Reality Check (RC), both of which control for data-snooping across the full model set.
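The evaluation step above can be sketched in a few lines. This is a minimal illustration, assuming the common textbook forms of three of the loss functions named above (MSE on variances, QLIKE, R2LOG); the paper's exact definitions should be checked against the original, and the function names and sample numbers here are hypothetical.

```python
import numpy as np

def mse(proxy, forecast):
    """Mean squared error between the variance proxy and the variance forecast."""
    proxy, forecast = np.asarray(proxy, float), np.asarray(forecast, float)
    return float(np.mean((proxy - forecast) ** 2))

def qlike(proxy, forecast):
    """Gaussian quasi-likelihood loss: mean of log(h) + proxy / h."""
    proxy, forecast = np.asarray(proxy, float), np.asarray(forecast, float)
    return float(np.mean(np.log(forecast) + proxy / forecast))

def r2log(proxy, forecast):
    """Mean squared log ratio of proxy to forecast."""
    proxy, forecast = np.asarray(proxy, float), np.asarray(forecast, float)
    return float(np.mean(np.log(proxy / forecast) ** 2))

# Hypothetical numbers, for illustration only.
rv = np.array([1.0, 2.0, 0.5, 1.5])   # realized-variance proxy
h_good = rv.copy()                    # a forecast that matches the proxy
h_bad = 2.0 * rv                      # a forecast that is twice too large

losses = {
    name: (fn(rv, h_good), fn(rv, h_bad))
    for name, fn in [("MSE", mse), ("QLIKE", qlike), ("R2LOG", r2log)]
}
```

Each entry of `losses` pairs the loss of the matching forecast with that of the inflated one; all three criteria rank the matching forecast first, but they penalize the error on different scales, which is why rankings can change with the loss function.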

Assumptions & inputs

  • Requires a volatility proxy; the realized-variance proxy (vs. noisy squared returns) is essential for the IBM analysis.
  • Conclusions are conditioned on these two series, the estimation/evaluation split, and the choice among the six loss functions.
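As a rough illustration of the proxy mentioned above, realized variance in its plainest form is the sum of squared intraday log returns over one trading day. The sketch below assumes that plain definition; this summary does not give the paper's sampling frequency or any adjustments, so those are left out, and the price path is hypothetical.

```python
import numpy as np

def realized_variance(intraday_prices):
    """Sum of squared intraday log returns for a single trading day."""
    log_prices = np.log(np.asarray(intraday_prices, dtype=float))
    intraday_returns = np.diff(log_prices)
    return float(np.sum(intraday_returns ** 2))

# Hypothetical intraday price path, for illustration only.
prices = [100.0, 100.5, 100.2, 100.9, 100.7]
rv = realized_variance(prices)

# The squared open-to-close return uses only two of the prices,
# which is why it is a much noisier proxy for the day's variance.
squared_daily_return = float(np.log(prices[-1] / prices[0]) ** 2)
```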

How to use it

  • As a model-selection discipline: justify added volatility-model complexity with multiple-comparison-aware tests, not in-sample fit.
  • Findings to carry forward: allowing for asymmetry (a leverage effect) mattered for the IBM equity series but not for the DM-$ exchange rate; the best overall model was A-PARCH(2,2).
  • Prefer the SPA test to the Reality Check: the paper shows the RC lacks power. It fails to detect that GARCH(1,1) is significantly outperformed on IBM and even suggests an ARCH(1) might not be beaten, whereas the SPA test always rejects ARCH(1).
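To make the mechanics of a multiple-comparison-aware test concrete, here is a simplified sketch in the spirit of White's Reality Check, using a stationary bootstrap. It is not the SPA test: SPA additionally studentizes each model's mean loss differential and recentres the null distribution in a sample-dependent way, which is the source of its extra power. The function names, the block length `mean_block`, the number of bootstrap draws, and the simulated losses are all assumptions made for illustration; for real work a tested implementation is the safer choice.

```python
import numpy as np

def stationary_bootstrap_indices(n, mean_block, rng):
    """Resampled time indices built from blocks of geometric length (wrapping around)."""
    new_block = rng.random(n) < 1.0 / mean_block
    new_block[0] = True
    starts = rng.integers(n, size=n)
    idx = np.empty(n, dtype=int)
    for t in range(n):
        idx[t] = starts[t] if new_block[t] else (idx[t - 1] + 1) % n
    return idx

def reality_check_pvalue(bench_loss, model_losses, n_boot=300, mean_block=10, seed=0):
    """Bootstrap p-value for H0: no competing model has lower expected loss
    than the benchmark. bench_loss has shape (n,), model_losses shape (n, m)."""
    rng = np.random.default_rng(seed)
    bench_loss = np.asarray(bench_loss, float)
    model_losses = np.asarray(model_losses, float)
    d = bench_loss[:, None] - model_losses      # positive means the model is better
    n = d.shape[0]
    d_bar = d.mean(axis=0)
    stat = np.sqrt(n) * d_bar.max()             # best competitor's advantage
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = stationary_bootstrap_indices(n, mean_block, rng)
        boot[b] = np.sqrt(n) * (d[idx].mean(axis=0) - d_bar).max()
    return float(np.mean(boot >= stat))

# Simulated losses, for illustration only.
sim = np.random.default_rng(1)
n_obs = 300
bench = sim.normal(size=n_obs) ** 2 + 1.0               # benchmark, mean loss about 2
similar = sim.normal(size=(n_obs, 5)) ** 2 + 1.0        # five models no better on average
better = sim.normal(size=n_obs) ** 2                    # one model with mean loss about 1

p_without_better = reality_check_pvalue(bench, similar)
p_with_better = reality_check_pvalue(bench, np.column_stack([similar, better]))
```

A small p-value says that at least one competitor beats the benchmark after accounting for the search over all models. Because the statistic takes a maximum over the whole set, adding many poor models can inflate the bootstrap distribution and mask a genuinely better one, which is the loss of power the paper attributes to the RC.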

Limitations & pitfalls

  • Results depend on the realized-variance proxy and on which of the six loss functions is used.
  • Conclusions pertain to these two series and sample periods; generalization requires care.
  • A no-rejection for FX is "no evidence against GARCH(1,1)," not proof of its optimality.

Key references

  • Hansen, P. & Lunde, A. (2005) — A Forecast Comparison of Volatility Models — Journal of Applied Econometrics
  • Hansen, P. (2005) — A Test for Superior Predictive Ability — Journal of Business & Economic Statistics
  • Glosten, L., Jagannathan, R. & Runkle, D. (1993) — On the Relation between the Expected Value and the Volatility of the Nominal Excess Return on Stocks — Journal of Finance



  • Provenance: verified/generated from the paper's full text.


    Wiki last updated: June 22, 2026