Shrinking the Cross-Section

Serhiy Kozak, Stefan Nagel, Shrihari Santosh

SSRN Electronic Journal · 2020 · 738 citations


Source: Kozak, S., Nagel, S. & Santosh, S. (2020). Journal of Financial Economics 135(2), 271–292. (NBER WP #24070, 2017)


TL;DR

Estimates the stochastic discount factor (SDF) from a large set of characteristic-based factors using economically motivated shrinkage: a Bayesian prior that penalizes SDFs implying implausibly high (near-arbitrage) Sharpe ratios. The central message is that an SDF that is sparse in characteristics (a handful of anomalies or factors) cannot summarize the cross-section of expected returns, but a low-dimensional SDF built from a few leading principal components (PCs) of the candidate factors can. The zoo is low-rank, not sparse.


Problem it solves

Empirical asset pricing keeps expanding small factor models (FF3 → q4 → FF5 → six-factor models) as new anomalies appear, but these models are typically tested against only a small set of portfolios. With dozens to hundreds of characteristics, estimating the SDF's loadings becomes a high-dimensional problem: OLS overfits, and naive variable selection ("which 3 anomalies matter?") has no statistical or economic justification, since present-value/q-theory logic implies that many characteristics should matter.


The method

  • Build candidate long-short factors from many characteristics; estimate SDF coefficients b that minimize pricing errors subject to an economic prior.
  • The prior penalizes the maximum squared Sharpe ratio implied by the SDF, which shrinks the contribution of low-variance principal components of the factors. Mechanically this resembles ridge (L2) regression, but it is rooted in a no-near-arbitrage argument rather than an arbitrary penalty.
  • Optional L1 (Lasso) sparsity and a combined L1+L2 (elastic-net) "dual-penalty" specification to test whether sparsity helps.
  • Penalty strength chosen to maximize the cross-validated cross-sectional out-of-sample R².
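
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: it uses the ridge-type closed form b = (Σ + γI)⁻¹ μ̄ for the L2-only case, contiguous time folds, and a cross-sectional OOS R² defined as 1 − ‖μ̄₂ − Σ₂b‖² / ‖μ̄₂‖² on the withheld fold. The function names, the simulated data, the penalty grid, and the default of three folds are all hypothetical choices made for the example.

```python
import numpy as np

def sdf_coefficients_l2(returns, gamma):
    """L2-shrunk SDF coefficients: b = (Sigma + gamma * I)^{-1} mu_bar.

    returns: (T, N) array of factor returns; gamma: penalty strength (>= 0).
    """
    mu_bar = returns.mean(axis=0)
    sigma = np.cov(returns, rowvar=False)
    return np.linalg.solve(sigma + gamma * np.eye(sigma.shape[0]), mu_bar)

def oos_r2(returns_test, b):
    """Cross-sectional R^2 of pricing errors on withheld data.

    Pricing errors are mu_test - Sigma_test @ b; the benchmark is b = 0.
    """
    mu = returns_test.mean(axis=0)
    sigma = np.cov(returns_test, rowvar=False)
    resid = mu - sigma @ b
    return 1.0 - (resid @ resid) / (mu @ mu)

def cross_validate_gamma(returns, gammas, n_folds=3):
    """Pick the gamma with the highest average withheld-fold R^2."""
    gammas = np.asarray(gammas, dtype=float)
    T = returns.shape[0]
    folds = np.array_split(np.arange(T), n_folds)  # contiguous blocks in time
    scores = []
    for gamma in gammas:
        fold_scores = []
        for test_idx in folds:
            train_mask = np.ones(T, dtype=bool)
            train_mask[test_idx] = False
            b = sdf_coefficients_l2(returns[train_mask], gamma)
            fold_scores.append(oos_r2(returns[test_idx], b))
        scores.append(np.mean(fold_scores))
    scores = np.array(scores)
    return gammas[int(np.argmax(scores))], scores

# Simulated factor returns (illustrative only): three common components plus noise.
rng = np.random.default_rng(0)
T, N = 600, 20
loadings = rng.normal(size=(N, 3))
common = rng.normal(size=(T, 3))
returns = 0.02 + 0.05 * common @ loadings.T + rng.normal(scale=0.05, size=(T, N))

gammas = np.logspace(-4, 0, 9)
best_gamma, scores = cross_validate_gamma(returns, gammas)
print("selected gamma:", best_gamma)
print("CV R^2 by gamma:", scores.round(3))
```

Because the penalty adds γ to every eigenvalue of Σ, directions with small variance are shrunk proportionally more than high-variance ones, which is the mechanism the second bullet describes.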

Assumptions & inputs

  • Inputs: a panel of characteristic-based factor (managed-portfolio) returns; the prior hyperparameter (the root expected squared Sharpe ratio under the prior), set by cross-validation; and, for PC-space specifications, the number K of PCs to retain.
  • Linear SDF in the chosen factors; no-near-arbitrage motivates the prior.
  • Datasets: Fama–French 25 size/BM portfolios (1926–2016) as a sanity check; 50 well-known anomaly portfolios; ~80 portfolios from WRDS financial ratios and lagged returns; plus extremely high-dimensional extensions adding powers and interactions of characteristics.
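
The link between the prior hyperparameter in the inputs list and the penalty strength can be written out as follows. This is a sketch in notation assumed here (μ̄ and Σ are the sample mean and covariance of the factor returns, T the sample length, τ the trace of Σ, κ the root expected squared Sharpe ratio under the prior, γ the implied L2 penalty, and d_j the j-th eigenvalue of Σ with μ̄_{P,j} the mean return of the j-th PC); check it against the paper before relying on it.

```latex
\begin{aligned}
\mu &\sim \mathcal{N}\!\left(0,\ \tfrac{\kappa^{2}}{\tau}\,\Sigma^{2}\right),
  \qquad \tau = \operatorname{tr}(\Sigma), \\
\mathbb{E}\!\left[\mu^{\top}\Sigma^{-1}\mu\right]
  &= \tfrac{\kappa^{2}}{\tau}\,\operatorname{tr}(\Sigma) = \kappa^{2}, \\
\hat{b} &= (\Sigma + \gamma I)^{-1}\,\bar{\mu},
  \qquad \gamma = \frac{\tau}{\kappa^{2}\,T}, \\
\hat{b}_{P,j} &= \frac{d_j}{d_j + \gamma}\cdot\frac{\bar{\mu}_{P,j}}{d_j}.
\end{aligned}
```

Read this way, a smaller κ (a more skeptical prior about attainable Sharpe ratios) means a larger γ, and the last line shows that the shrinkage factor d_j / (d_j + γ) bites hardest on PCs with small eigenvalues.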

How to use it / findings

  • L2-only shrinkage delivers the best (or near-best) out-of-sample (OOS) performance in the space of base characteristics, making it a natural default when sparsity is not required.
  • L1-only (pure Lasso) often struggles OOS in high-dimensional base-characteristic spaces; sparsity in raw characteristics is limited even with the dual penalty.
  • In PC space, a sparse SDF on a few leading PCs does very well — even one PC gets close to the maximum OOS R² on FF25; a handful suffices on the larger datasets.
  • Heavy shrinkage is essential: unregularized SDFs overfit badly; the data "call for substantial L2-shrinkage but essentially no sparsity."
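
The PC-space finding above can be illustrated with a short sketch. It rotates the factors into principal components, keeps only the K leading ones, applies the same ridge-type shrinkage to them, and maps the coefficients back to the original factors. Hard truncation to the leading PCs is a simplification swapped in for this example; the wiki text describes sparsity as coming from an L1 penalty. The function name `pc_sparse_sdf` and the simulated data are hypothetical.

```python
import numpy as np

def pc_sparse_sdf(returns, gamma, n_pcs):
    """SDF coefficients that load only on the n_pcs highest-variance PCs.

    Returns (b, b_pc): coefficients on the original factors and on the PCs.
    With n_pcs equal to the number of factors this reduces to
    (Sigma + gamma * I)^{-1} mu_bar.
    """
    mu_bar = returns.mean(axis=0)
    sigma = np.cov(returns, rowvar=False)

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    order = np.argsort(eigvals)[::-1]
    d = eigvals[order]
    Q = eigvecs[:, order]

    mu_pc = Q.T @ mu_bar                 # mean returns of the PC portfolios
    b_pc = np.zeros_like(d)
    b_pc[:n_pcs] = mu_pc[:n_pcs] / (d[:n_pcs] + gamma)  # shrink kept PCs
    return Q @ b_pc, b_pc                # rotate back to factor space

# Simulated factor returns (illustrative only).
rng = np.random.default_rng(0)
T, N = 600, 12
loadings = rng.normal(size=(N, 2))
common = rng.normal(size=(T, 2))
returns = 0.02 + 0.05 * common @ loadings.T + rng.normal(scale=0.03, size=(T, N))

for k in (1, 3, N):
    b, b_pc = pc_sparse_sdf(returns, gamma=0.01, n_pcs=k)
    print(k, "PCs kept; nonzero PC coefficients:", np.count_nonzero(b_pc))
```

Note that the SDF is sparse only in the rotated basis: the vector b on the original characteristics is generally dense even when a single PC is kept.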

Limitations & pitfalls

  • PCs are statistical and rotate with the input factor set; the economic prior (max-SR penalty) is a modeling choice and its hyperparameter must be tuned.
  • Linear SDF in the chosen portfolios (nonlinear/deep-learning extensions follow in later work).
  • OOS R²s are cross-sectional and evaluated over long withheld windows; magnitudes depend on the portfolio universe.

Key references

  • Kozak, S., Nagel, S. & Santosh, S. (2020) — Shrinking the Cross-Section — Journal of Financial Economics
  • Kozak, S., Nagel, S. & Santosh, S. (2018) — Interpreting Factor Models — Journal of Finance
  • Kelly, B., Pruitt, S. & Su, Y. (2019) — Characteristics Are Covariances — Journal of Financial Economics



Provenance: verified/generated from the paper's full text.


    Wiki last updated: June 24, 2026