Dissecting Characteristics Nonparametrically

Source: Freyberger, J., Neuhierl, A. & Weber, M. (2020). Review of Financial Studies 33(5), 2326–2377. (NBER WP #23227, 2017)

TL;DR

Asks which firm characteristics provide independent, incremental predictive power for the cross-section of returns when their effects may be nonlinear. An adaptive group LASSO selects the characteristics and estimates their effects nonparametrically in a single procedure. Of 36 candidates, only a small set survives (about eight in the first half of the sample), many popular predictors are subsumed, and nonlinearities matter: the nonparametric model raises out-of-sample Sharpe ratios by roughly 50% over a linear panel regression.

Problem it solves

Addresses Cochrane's (2011) "multidimensional challenge": with 300+ published factors, which characteristics carry independent information and which are subsumed? Portfolio sorts suffer from the curse of dimensionality once more than a few characteristics are crossed, and Fama–MacBeth linear regressions impose a linear functional form and are sensitive to outliers. This method lets the data choose both which characteristics enter and the shape of each effect.

The method

Model expected returns as an additive nonparametric function of characteristics, approximating each characteristic's effect with a quadratic spline (a group of basis terms per characteristic).
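
To make "a group of basis terms per characteristic" concrete, here is a minimal sketch of one common way to write a quadratic spline, the truncated-power basis. The function name, the equally spaced knots, and the support bounds lo and hi are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def quadratic_spline_basis(x, n_knots, lo=0.0, hi=1.0):
    """Truncated-power quadratic spline basis on [lo, hi].

    Columns: x, x^2, then max(x - t_k, 0)^2 for each interior knot t_k.
    The constant term is left out so one shared intercept can be fit separately.
    """
    x = np.asarray(x, dtype=float)
    # Assumption: interior knots are equally spaced strictly inside (lo, hi).
    knots = np.linspace(lo, hi, n_knots + 2)[1:-1]
    cols = [x, x**2] + [np.clip(x - t, 0.0, None) ** 2 for t in knots]
    return np.column_stack(cols)

# One characteristic evaluated on a grid, with 4 interior knots.
x = np.linspace(0.0, 1.0, 11)
B = quadratic_spline_basis(x, n_knots=4)
print(B.shape)  # (11, 6): two polynomial columns plus one column per knot
```

Stacking one such block per characteristic side by side gives the design matrix on which a group penalty can act block by block.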

Apply the adaptive group LASSO (Huang, Horowitz & Wei 2010): the group penalty sets a characteristic's entire spline block to zero (selection), while the remaining blocks estimate the nonlinear shapes; the second-step "adaptive" weights, built from a first-step group LASSO fit, give consistent selection.
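
The two-step logic can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the solver is a generic proximal-gradient loop with block soft-thresholding, and the function names, the penalty levels lam1 and lam2, and the simulated data are all hypothetical.

```python
import numpy as np

def group_lasso(X, y, groups, lam, weights=None, n_iter=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + lam * sum_g w_g * ||b_g||_2."""
    n, p = X.shape
    group_ids = np.unique(groups)
    if weights is None:
        weights = {g: 1.0 for g in group_ids}
    # Step size = 1 / Lipschitz constant of the least-squares gradient.
    step = n / np.linalg.norm(X, 2) ** 2
    b = np.zeros(p)
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y)) / n
        for g in group_ids:
            idx = groups == g
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * weights[g]
            # Block soft-thresholding: shrink the whole group or zero it out.
            scale = max(0.0, 1.0 - thresh / norm) if norm > 0 else 0.0
            b[idx] = scale * z[idx]
    return b

def adaptive_group_lasso(X, y, groups, lam1, lam2):
    """Step 1: plain group LASSO. Step 2: reweight each group by 1 / ||b_g||."""
    b1 = group_lasso(X, y, groups, lam1)
    weights = {}
    for g in np.unique(groups):
        norm = np.linalg.norm(b1[groups == g])
        # Groups dropped in step 1 get an infinite weight and stay out.
        weights[g] = 1.0 / norm if norm > 0 else np.inf
    return group_lasso(X, y, groups, lam2, weights)

# Hypothetical data: three "characteristics", three basis columns each,
# and only the first block affects y.
rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 9))
groups = np.repeat(np.arange(3), 3)
y = X[:, :3] @ np.array([1.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(n)

b = adaptive_group_lasso(X, y, groups, lam1=0.1, lam2=0.05)
selected = [int(g) for g in np.unique(groups) if np.linalg.norm(b[groups == g]) > 0]
print(selected)
```

Because the penalty acts on the Euclidean norm of a whole block, a characteristic is either dropped entirely or kept with all of its spline coefficients, which is what turns estimation into selection.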

Tune via grids of knots (4, 9, 14, 19); more knots imply a heavier per-characteristic penalty, so fewer characteristics are selected.

Characteristics are rank-transformed cross-sectionally to the unit interval [0,1], which makes the estimates insensitive to outliers; the model is validated out of sample via rolling estimation and one-month-ahead hedge portfolios.
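
The outlier argument is easy to check numerically. The sketch below is illustrative only: the function name is hypothetical, ties are broken by position rather than averaged, and the target interval is passed in as a parameter because it is just an affine rescaling of the ranks.

```python
import numpy as np

def rank_transform(x, lo, hi):
    """Map one cross-section of a characteristic to ranks scaled into (lo, hi)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1   # 1 = smallest; ties broken by position
    u = ranks / (len(x) + 1)                # strictly inside (0, 1)
    return lo + (hi - lo) * u

clean = [0.5, 2.0, -1.0, 3.0]
with_outlier = [0.5, 2.0, -1.0, 3000.0]     # same ordering, one extreme value

t_clean = rank_transform(clean, 0.0, 1.0)
t_outlier = rank_transform(with_outlier, 0.0, 1.0)
print(t_clean)                              # [0.4 0.6 0.2 0.8]
print(np.array_equal(t_clean, t_outlier))   # True
```

Only the ordering of firms within a month enters the model, so an extreme raw value moves nothing as long as the ordering is unchanged.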

Assumptions & inputs

Inputs: a panel of firm characteristics (cross-sectionally ranked), one-month-ahead returns, knot grid, penalty chosen by validation.

Data: 36 characteristics (size, book-to-market, beta, momentum, etc.), U.S. stocks, July 1963 – June 2014.

Additive separability across characteristics (no built-in interactions); approximate sparsity for LASSO consistency.
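
In symbols, the two assumptions can be written roughly as below. The notation (m_t for the conditional mean function, c_s for the s-th rank-transformed characteristic, p_k for spline basis functions, beta for coefficients) is chosen here for illustration and may differ from the paper's.

```latex
% Additive separability: one univariate function per characteristic, no cross terms.
m_t(c_1, \dots, c_S) = \sum_{s=1}^{S} m_{t,s}(c_s),
\qquad
m_{t,s}(c) \approx \sum_{k} \beta_{t,s,k}\, p_k(c).

% Approximate sparsity: only a few characteristics have a nonzero component,
% i.e. \beta_{t,s} = 0 for most s.
```

An interaction such as a size effect that depends on momentum would require a term in two characteristics at once, which this specification excludes by construction.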

How to use it / findings

In sample, only eight characteristics are selected in the first half of the sample and 17 in the second half; for large stocks (above the 20% NYSE size cut) only seven remain (in-sample Sharpe 1.81).

Out of sample (rolling, predictions from January 1991), the nonparametric hedge portfolio earns an average Sharpe ratio of 3.42 vs 2.26 for the linear model, which is the roughly 50% improvement (3.42 / 2.26 ≈ 1.51).
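
For readers who want to reproduce this kind of comparison, here is a hedged sketch of the two computations involved. The equal-weighted top-minus-bottom 10% construction and the sqrt(12) annualization are assumptions made for illustration, and both function names are hypothetical.

```python
import numpy as np

def hedge_portfolio_return(predicted, realized, tail=0.10):
    """One month's long-short return: long the highest predicted returns,
    short the lowest, equal-weighted within each leg (an assumption)."""
    predicted = np.asarray(predicted, dtype=float)
    realized = np.asarray(realized, dtype=float)
    k = max(1, int(round(len(predicted) * tail)))
    order = np.argsort(predicted)
    return realized[order[-k:]].mean() - realized[order[:k]].mean()

def annualized_sharpe(monthly_returns):
    """Mean over standard deviation of monthly returns, scaled by sqrt(12)."""
    r = np.asarray(monthly_returns, dtype=float)
    return np.sqrt(12.0) * r.mean() / r.std(ddof=1)

# Relative improvement implied by the two reported Sharpe ratios.
improvement = 3.42 / 2.26 - 1.0
print(round(improvement, 3))  # 0.513
```

Applying hedge_portfolio_return month by month to the rolling forecasts gives a return series for each model, and annualized_sharpe turns each series into the single number being compared.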

The nonparametric model selects far fewer characteristics (8 vs the linear model's 21, or 27 in a later window) yet predicts better OOS: the dense linear model overfits in sample.

Several effects are materially nonlinear, which linear cross-sectional regressions miss; predictive power also varies substantially over time.

Limitations & pitfalls

Additive structure rules out the interaction effects later captured by trees/neural nets.

Selection is sensitive to the candidate set, the knot grid, and the validation scheme.

Quadratic-spline shapes are smooth approximations; estimated functions can be noisy in sparse regions of a characteristic.

Key references

Freyberger, J., Neuhierl, A. & Weber, M. (2020) — Dissecting Characteristics Nonparametrically — Review of Financial Studies

Huang, J., Horowitz, J. & Wei, F. (2010) — Variable Selection in Nonparametric Additive Models — Annals of Statistics

Gu, S., Kelly, B. & Xiu, D. (2020) — Empirical Asset Pricing via Machine Learning — Review of Financial Studies

Provenance: verified/generated from the paper's full text.