
Characteristics Are Covariances: A Unified Model of Risk and Return

Bryan Kelly, Seth Pruitt, Yinan Su

2019 · 780 citations


Characteristics Are Covariances: A Unified Model of Risk and Return (IPCA)


Source: Kelly, B. T., Pruitt, S. & Su, Y. (2019). Journal of Financial Economics 134(3), 501–524. (NBER WP #24540, 2018)


TL;DR

Introduces Instrumented Principal Components Analysis (IPCA) — a latent-factor model in which a stock's factor loadings are linear functions of its observable characteristics. The provocative claim in the title: characteristics predict returns mainly because they proxy for risk exposures (covariances/betas), not because they earn anomalous alpha. Just four IPCA factors with characteristic-driven loadings price the cross-section, leaving anomaly intercepts small and statistically insignificant.


Problem it solves

Factors and loadings in the asset-pricing Euler equation are unobservable, and the "factor zoo" offers dozens of return-predicting characteristics with no agreement on which carry independent information or correspond to risk. IPCA bridges the "characteristics" view (firm attributes predict returns) and the "covariances" view (risk exposures earn premia) in one estimable conditional factor model, and provides a formal test of whether a characteristic enters through risk (loadings) or as mispricing (alpha).


The method

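  • Factor model r_{i,t+1} = α_{i,t} + β_{i,t} f_{t+1} + ε_{i,t+1}, with time-varying loadings instrumented by characteristics: β_{i,t} = Z_{i,t} Γ_β (and optionally α_{i,t} = Z_{i,t} Γ_α). Here Z_{i,t} is the 1×L row of stock i's characteristics at time t, Γ_β is L×K, Γ_α is L×1, and f_{t+1} is the K×1 factor vector, so β_{i,t} is 1×K and β_{i,t} f_{t+1} is a scalar.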
  • Latent factors f_t and the mapping matrices Γ are estimated jointly via instrumented PCA (an alternating least-squares / managed-portfolio procedure).
  • A hypothesis test of Γ_α = 0 (via a residual bootstrap with 1,000 draws) asks whether characteristics carry return information beyond their loadings, i.e., whether anomaly alphas survive.
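
The alternating least-squares idea in the bullets above can be sketched in a few lines of NumPy. This is a minimal illustration for the restricted model (Γ_α = 0), not the authors' code. It assumes a balanced panel with no missing data, uses simulated inputs, and only normalizes Γ_β to have orthonormal columns, so factors are identified up to rotation. The function names (`estimate_factors`, `estimate_gamma`, `ipca_als`), the array layout and the simulation settings are all hypothetical.

```python
import numpy as np

def estimate_factors(Z, R, Gamma):
    """Given Gamma, regress each period's returns on that period's loadings."""
    T, K = Z.shape[0], Gamma.shape[1]
    F = np.empty((T, K))
    for t in range(T):
        beta_t = Z[t] @ Gamma                      # (N, K) loadings Z_t Gamma
        F[t] = np.linalg.lstsq(beta_t, R[t], rcond=None)[0]
    return F

def estimate_gamma(Z, R, F):
    """Given factors, pooled OLS of returns on the interactions z_{i,l} * f_{t,k}."""
    T, N, L = Z.shape
    K = F.shape[1]
    A = np.zeros((L * K, L * K))
    b = np.zeros(L * K)
    for t in range(T):
        # Column l*K + k holds the regressor for Gamma[l, k].
        X_t = (Z[t][:, :, None] * F[t][None, None, :]).reshape(N, L * K)
        A += X_t.T @ X_t
        b += X_t.T @ R[t]
    return np.linalg.solve(A, b).reshape(L, K)

def ipca_als(Z, R, K, n_iter=500, tol=1e-10):
    """Alternate between the two least-squares steps until the span of Gamma settles."""
    T, N, L = Z.shape
    # Managed-portfolio returns x_t = Z_t' r_t / N, used only to initialize Gamma.
    X = np.einsum("tnl,tn->tl", Z, R) / N
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Gamma = Vt[:K].T
    for _ in range(n_iter):
        F = estimate_factors(Z, R, Gamma)
        # Orthonormalize columns (assumed normalization; fixes scale, not rotation).
        Q, _ = np.linalg.qr(estimate_gamma(Z, R, F))
        change = np.linalg.norm(Q @ Q.T - Gamma @ Gamma.T)  # rotation-invariant
        Gamma = Q
        if change < tol:
            break
    return Gamma, estimate_factors(Z, R, Gamma)

# Simulated panel (hypothetical sizes): Z[t] holds characteristics known at t,
# R[t] holds the returns realized over the following period.
rng = np.random.default_rng(0)
T, N, L, K = 60, 200, 5, 2
Z = rng.normal(size=(T, N, L))
Gamma_true, _ = np.linalg.qr(rng.normal(size=(L, K)))
F_true = rng.normal(size=(T, K))
R = np.einsum("tnl,lk,tk->tn", Z, Gamma_true, F_true) + 0.1 * rng.normal(size=(T, N))

Gamma_hat, F_hat = ipca_als(Z, R, K)
fitted = np.einsum("tnl,lk,tk->tn", Z, Gamma_hat, F_hat)
fit_r2 = 1.0 - np.sum((R - fitted) ** 2) / np.sum(R ** 2)
span_gap = np.linalg.norm(Gamma_hat @ Gamma_hat.T - Gamma_true @ Gamma_true.T)
print(f"fit R2: {fit_r2:.3f}, distance between column spaces: {span_gap:.4f}")
```

Convergence is checked on the projector Γ Γ' rather than on Γ itself because Γ_β and the factors are only pinned down up to a rotation. A real application would also need to handle an unbalanced panel and pick a full identification scheme.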

Assumptions & inputs

  • Sample: over 12,000 U.S. stocks, 1962–2014, with 36 firm characteristics.
  • Loadings are restricted to be linear in characteristics; the number of latent factors K is a tuning choice (K = 4 is the headline).
  • No-arbitrage / linear SDF framework; characteristics are the instruments for dynamic betas.

How to use it / findings

  • Four IPCA factors deliver a total R² of 19.4% (vs 21.9% for the matched Fama–French five-factor model, FF5) and an in-sample predictive R² of 1.8% (vs 0.3% for FF5); the out-of-sample predictive R² is 0.7% for monthly returns.
  • The restriction Γ_α = 0 is not rejected — characteristic-associated anomaly intercepts are small and insignificant, consistent with risk compensation over mispricing.
  • Only 8 of the 36 characteristics are statistically significant in the IPCA specification, and they account for nearly 100% of the model's accuracy.
  • Out-of-sample tangency Sharpe ratio of 2.6 vs 1.3 for the observable-factor benchmark.
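
The fit measures quoted above can be illustrated with a short NumPy sketch. It assumes the usual IPCA-style definitions: total R² compares returns with Z Γ_β f using realized factors, predictive R² replaces the realized factor with a constant risk-price vector λ (here the sample mean of the factors), and both use the sum of squared returns, not demeaned, as the denominator. The tangency Sharpe ratio shown is the simple in-sample, per-period version, whereas the figure in the text is out-of-sample. The function names, array layout and simulated data are hypothetical.

```python
import numpy as np

def total_r2(R, Z, Gamma, F):
    """Share of return variation explained using realized factors f_{t+1}."""
    fitted = np.einsum("tnl,lk,tk->tn", Z, Gamma, F)
    return 1.0 - np.sum((R - fitted) ** 2) / np.sum(R ** 2)

def predictive_r2(R, Z, Gamma, lam):
    """Share explained by conditional expected returns Z_t Gamma lam."""
    predicted = np.einsum("tnl,lk,k->tn", Z, Gamma, lam)
    return 1.0 - np.sum((R - predicted) ** 2) / np.sum(R ** 2)

def tangency_sharpe(F):
    """In-sample, per-period Sharpe ratio of the factor tangency portfolio."""
    mu = F.mean(axis=0)
    Sigma = np.atleast_2d(np.cov(F, rowvar=False))
    return float(np.sqrt(mu @ np.linalg.solve(Sigma, mu)))

# Simulated inputs (hypothetical): factor means are small relative to factor
# volatility, so realized factors explain far more than their means do.
rng = np.random.default_rng(1)
T, N, L, K = 120, 500, 4, 2
Z = rng.normal(size=(T, N, L))
Gamma, _ = np.linalg.qr(rng.normal(size=(L, K)))
F = np.array([0.3, 0.2]) + rng.normal(size=(T, K))
R = np.einsum("tnl,lk,tk->tn", Z, Gamma, F) + rng.normal(size=(T, N))

lam = F.mean(axis=0)
r2_total = total_r2(R, Z, Gamma, F)
r2_pred = predictive_r2(R, Z, Gamma, lam)
sharpe = tangency_sharpe(F)
print(f"total R2: {r2_total:.3f}, predictive R2: {r2_pred:.3f}, Sharpe: {sharpe:.3f}")
```

In this simulation the predictive R² is much smaller than the total R², which mirrors the pattern in the reported numbers: explaining realized return variation is a different task from explaining expected returns. Multiplying the per-period Sharpe ratio by √12 would annualize it for monthly data.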

Limitations & pitfalls

  • Loadings are restricted to be linear in characteristics (relaxed later by autoencoders).
  • Latent factors are statistical, not economically labeled; results depend on the characteristic set and on K.
  • Plain PCA on the same data gives higher total R² (29.0%) but negative predictive R² — fit of variance ≠ explanation of average returns.

Key references

  • Kelly, B., Pruitt, S. & Su, Y. (2019) — Characteristics Are Covariances — Journal of Financial Economics
  • Kozak, S., Nagel, S. & Santosh, S. (2020) — Shrinking the Cross-Section — Journal of Financial Economics
  • Gu, S., Kelly, B. & Xiu, D. (2021) — Autoencoder Asset Pricing Models — Journal of Econometrics



Provenance: verified/generated from the paper's full text.


    Community-maintained wiki — anyone can suggest an edit or view its revision history. Not peer-reviewed; verify claims against the original paper.

    Wiki last updated: June 24, 2026