ConvexPi

A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix

Whitney K. Newey, Kenneth D. West

Econometrica · 1987 · 17114 citations




Source: Newey, W. K. & West, K. D. (1987) · Econometrica 55(3), 703–708 · DOI: 10.2307/1913610


Problem it solves

GMM and regression inference need a consistent estimate of the long-run (frequency-zero) covariance matrix S of the moment conditions (the score). The simplest truncated estimator, an unweighted sum of the sample autocovariances out to lag m, is consistent but need not be positive semi-definite in finite samples, so estimated variances can come out negative, leaving the corresponding standard errors and t-statistics undefined. Ordinary standard errors, and even White heteroskedasticity-robust ones, are also invalid when errors are serially correlated, which is pervasive in finance with overlapping observations or persistent predictors. Left uncorrected, positive serial correlation makes naive t-statistics overstate significance and manufacture spurious predictability.
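A tiny worked example makes the finite-sample failure concrete. The sketch below is illustrative only: the sign-flipping series and the helper names autocov and truncated_lrv are assumptions made for this example, not something taken from the paper.

```python
def autocov(x, j):
    """j-th sample autocovariance of a mean-zero scalar series (divides by T)."""
    T = len(x)
    return sum(x[t] * x[t - j] for t in range(j, T)) / T


def truncated_lrv(x, m):
    """Unweighted (truncated) long-run variance: gamma_0 + 2 * sum_{j<=m} gamma_j."""
    return autocov(x, 0) + 2.0 * sum(autocov(x, j) for j in range(1, m + 1))


# Illustrative series that flips sign every period (mean exactly zero for even T).
x = [1.0 if t % 2 == 0 else -1.0 for t in range(10)]

gamma0 = autocov(x, 0)   # 10 / 10 = 1.0
gamma1 = autocov(x, 1)   # -9 / 10 = -0.9
v = truncated_lrv(x, 1)  # 1.0 + 2 * (-0.9) = -0.8, a negative "variance"
```

With one lag the unweighted sum is negative because the first autocovariance is almost as large in magnitude as the variance and enters with a weight of two.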


The method

Estimate the long-run variance as a weighted sum of autocovariances up to a chosen lag (bandwidth) m, using modified Bartlett (triangular) weights that decline linearly with lag:


  Ŝ_T = Ω̂₀ + Σ_{j=1}^{m} w(j,m)·(Ω̂ⱼ + Ω̂ⱼ′), with w(j,m) = 1 − j/(m+1),

where Ω̂ⱼ is the j-th sample autocovariance of the moment terms. This is numerically 2π times an estimator of the spectral density at frequency zero, smoothed with the Bartlett window. The triangular weighting guarantees that Ŝ_T is positive semi-definite by construction (the paper's Theorem 1, proved via positive-semi-definiteness of the Bartlett-weighted autocovariance matrix). For consistency the bandwidth m must grow with the sample size T, but slowly (Theorem 2 takes m → ∞ with m/T^{1/4} → 0). The resulting "HAC" standard errors give asymptotically valid inference.
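The formula translates almost line for line into NumPy. This is a minimal sketch rather than a reference implementation: the function name newey_west_lrv is hypothetical, and it assumes the moment terms are supplied as a (T, k) array that is already mean zero, with autocovariances scaled by 1/T.

```python
import numpy as np


def newey_west_lrv(g, m):
    """Bartlett-weighted long-run covariance of a (T, k) array of moment terms.

    Assumes the columns of g are already mean zero (e.g. x_t * residual_t
    evaluated at the estimate); no demeaning is done here.
    """
    g = np.asarray(g, dtype=float)
    if g.ndim == 1:
        g = g[:, None]                    # treat a 1-D input as a single moment
    T = g.shape[0]
    S = g.T @ g / T                       # Omega_0
    for j in range(1, m + 1):
        omega_j = g[j:].T @ g[:-j] / T    # Omega_j = (1/T) sum_t g_t g_{t-j}'
        w = 1.0 - j / (m + 1.0)           # Bartlett weight w(j, m)
        S += w * (omega_j + omega_j.T)
    return S
```

For a scalar series that alternates between +1 and −1 over ten periods, the first autocovariance is −0.9, so with m = 1 the unweighted sum would be 1 − 1.8 = −0.8, while the Bartlett weight of 1/2 gives 1 − 0.9 = 0.1, which stays positive. Setting m = 0 returns Ω̂₀ alone, the heteroskedasticity-only case.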


Assumptions & inputs

  • Inputs: the moment/score series (e.g., x·residual in a regression) and a bandwidth m.
  • Requires the moment series to satisfy the mixing and moment-existence conditions needed for a central limit theorem; results are large-sample (asymptotic).
  • In a regression, plugging Ŝ_T into the usual sandwich formula delivers the familiar Newey–West standard errors for each coefficient.
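For the regression case, the inputs above combine as follows. This is a sketch under stated assumptions: the function name ols_newey_west is hypothetical, the moment terms are taken to be x_t times the OLS residual, the covariance uses the standard sandwich form Q⁻¹ Ŝ_T Q⁻¹ / T with Q = X′X/T, and no small-sample degrees-of-freedom adjustment is applied.

```python
import numpy as np


def ols_newey_west(y, X, m):
    """OLS coefficients and Newey-West standard errors (no small-sample scaling).

    X is (T, k) and should already contain a constant column if one is wanted.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    g = X * resid[:, None]                # moment terms x_t * e_t
    S = g.T @ g / T                       # Omega_0
    for j in range(1, m + 1):
        omega_j = g[j:].T @ g[:-j] / T    # j-th sample autocovariance
        S += (1.0 - j / (m + 1.0)) * (omega_j + omega_j.T)
    Q_inv = np.linalg.inv(X.T @ X / T)
    cov = Q_inv @ S @ Q_inv / T           # sandwich: Q^{-1} S Q^{-1} / T
    return beta, np.sqrt(np.diag(cov))
```

With m = 0 the result reduces to White's heteroskedasticity-robust (HC0) standard errors, which is a convenient sanity check on any implementation.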

How to use it

  • The default fix for inference in time-series and panel finance regressions: predictive regressions, long-horizon/overlapping returns, and the second stage of Fama–MacBeth.
  • Choose the lag length m: a common rule of thumb is m ≈ 4(T/100)^{2/9}; data-driven alternatives are given by Andrews (1991) and Newey–West (1994).
  • Directly relevant to honest backtest evaluation — uncorrected autocorrelation is a classic way regressions look significant when they are not. Pairs conceptually with the multiple-testing / deflated-Sharpe literature.
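The rule-of-thumb lag length is a one-liner. In this sketch the function name rule_of_thumb_lag is hypothetical, and taking the integer part of the expression is an assumption, since the text only gives the approximate form.

```python
import math


def rule_of_thumb_lag(T):
    """Integer part of 4 * (T / 100) ** (2 / 9) for a sample of size T."""
    return int(math.floor(4.0 * (T / 100.0) ** (2.0 / 9.0)))


lag_100 = rule_of_thumb_lag(100)    # 4 * 1 ** (2/9) = 4
lag_1000 = rule_of_thumb_lag(1000)  # 4 * 10 ** (2/9) is about 6.67, so 6
```

The exponent 2/9 means the suggested lag grows very slowly: a tenfold increase in the sample size raises it only from 4 to 6.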

Limitations & pitfalls

  • Results depend on the bandwidth (lag) choice: too few lags under-correct for autocorrelation, while too many make the estimate noisy (Andrews 1991 gives data-driven selection).
  • It is a large-sample correction; small samples can still mislead, and HAC standard errors can be downward-biased with highly persistent regressors.
  • The triangular kernel is not the most efficient (Andrews 1991 shows the quadratic-spectral kernel is asymptotically optimal), but it is simple and guaranteed PSD.
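The sensitivity to the lag choice is easy to see in a simulation. The sketch below is illustrative only: the AR(1) coefficient of 0.9, the sample size, the seed, the grid of lags, and the helper name bartlett_lrv are all assumptions chosen for the example.

```python
import random


def bartlett_lrv(x, m):
    """Bartlett-weighted long-run variance of a scalar series (demeaned here)."""
    T = len(x)
    mean = sum(x) / T
    d = [v - mean for v in x]
    lrv = sum(v * v for v in d) / T       # lag-0 term
    for j in range(1, m + 1):
        gamma_j = sum(d[t] * d[t - j] for t in range(j, T)) / T
        lrv += 2.0 * (1.0 - j / (m + 1.0)) * gamma_j
    return lrv


# Simulated AR(1) with coefficient 0.9 and unit-variance shocks (illustrative
# values); its true long-run variance is 1 / (1 - 0.9) ** 2 = 100.
random.seed(0)
x, prev = [], 0.0
for _ in range(5000):
    prev = 0.9 * prev + random.gauss(0.0, 1.0)
    x.append(prev)

estimates = {m: bartlett_lrv(x, m) for m in (0, 4, 16, 64)}
for m, s in estimates.items():
    print(m, round(s, 1))
```

For a series this persistent, short lag lengths should leave the estimate far below the true value of 100, while longer ones get closer at the cost of more sampling noise.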

Key references

  • Newey, W. & West, K. (1987) — A Simple, Positive Semi-Definite, HAC Covariance Matrix — Econometrica
  • White, H. (1980) — A Heteroskedasticity-Consistent Covariance Matrix Estimator — Econometrica
  • Hansen, L. (1982) — Large Sample Properties of Generalized Method of Moments Estimators — Econometrica
  • Andrews, D. (1991) — Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation — Econometrica


Provenance: verified/generated from the paper's full text.


    Community-maintained wiki — anyone can suggest an edit or view its revision history. Not peer-reviewed; verify claims against the original paper.

    Wiki last updated: June 23, 2026