Lecture · 20–30 min

p-Hacking & the Multiple-Comparisons Trap

Test enough signals and some will look brilliant by pure chance. This is the single most important reason backtests lie — and exactly what the ConvexPi Lab’s out-of-sample grading is built to catch. We manufacture the illusion, measure it, and then fix it.
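A minimal sketch of "manufacturing the illusion", under stated assumptions: every signal is pure Gaussian noise with 1% daily volatility, a year has 252 trading days, the first half of each series is treated as in-sample and the second half as out-of-sample, and the p-value uses a normal approximation to the t-test. The names `sharpe`, `p_value`, and `simulate` are hypothetical helpers for this illustration and are not the Lab's grading code.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a return series (risk-free rate assumed 0)."""
    return math.sqrt(periods_per_year) * mean(returns) / stdev(returns)


def p_value(returns):
    """Two-sided p-value for 'mean return == 0' (normal approximation to the t-test)."""
    z = mean(returns) / (stdev(returns) / math.sqrt(len(returns)))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate(n_signals=1000, n_days=500, seed=0):
    """Generate pure-noise 'strategies' and score each on both halves of its history."""
    rng = random.Random(seed)
    split = n_days // 2
    results = []
    for _ in range(n_signals):
        r = [rng.gauss(0.0, 0.01) for _ in range(n_days)]  # no real edge by construction
        in_sample, out_sample = r[:split], r[split:]
        results.append((p_value(in_sample), sharpe(in_sample), sharpe(out_sample)))
    return results


results = simulate()

# Share of noise signals that look "significant" in-sample.
frac_significant = sum(p < 0.05 for p, _, _ in results) / len(results)

# Select the single best in-sample signal, then look at it out-of-sample.
best_p, best_is, best_oos = max(results, key=lambda t: t[1])

print(f"significant at p < 0.05: {frac_significant:.1%}")
print(f"best in-sample Sharpe:   {best_is:.2f} (p = {best_p:.4f})")
print(f"same signal, OOS Sharpe: {best_oos:.2f}")
```

With these settings roughly 5% of the noise signals should clear p < 0.05, and the best in-sample Sharpe is typically well above 2 even though no signal has any edge. The out-of-sample Sharpe of that same signal is just another draw from noise, so it usually lands far below the in-sample figure.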

Multiple comparisons · Bonferroni / FDR · Out-of-sample validation · Overfitting ratio

Key takeaways

1. Searching many signals inflates false positives: about 5% of pure-noise signals will “work” at p < 0.05, so testing 100 of them yields about five spurious winners.
2. Correct for the search (Bonferroni / FDR): the significance threshold a result must clear depends on how many things you tried, not just on the one you kept.
3. Out-of-sample (OOS) data is the ultimate judge: a signal selected only for its in-sample fit tends to revert to noise OOS.
4. This is why the Lab grades on hidden data and reports an overfitting ratio.
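The correction step in the takeaways can be sketched as follows. With m independent tests at level α, the chance of at least one false positive is 1 − (1 − α)^m, which is about 40% for m = 10 and α = 0.05. Bonferroni controls that family-wise rate by testing each p-value against α / m; the Benjamini-Hochberg procedure instead controls the false discovery rate (FDR) and is less conservative. The p-values below are made up for illustration, and the function names are hypothetical rather than anything the Lab exposes.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 only where p <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k_max = rank  # largest rank whose p-value clears its threshold
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True  # reject everything up to and including that rank
    return reject


# Hypothetical p-values from ten backtested signals (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

naive = sum(p < 0.05 for p in p_values)
bonf = sum(bonferroni(p_values))
bh = sum(benjamini_hochberg(p_values))

print(f"uncorrected: {naive} signals pass")          # 5
print(f"Bonferroni:  {bonf} signal(s) pass")         # 1
print(f"Benjamini-Hochberg: {bh} signal(s) pass")    # 2
```

Five signals look significant before correction, but only one survives Bonferroni and two survive Benjamini-Hochberg. Neither correction replaces out-of-sample testing; both only adjust the in-sample evidence for the size of the search.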

Put it into practice

This lecture underpins Mission 1 and Mission 3. Read it, then go earn (or fail to earn) an out-of-sample Sharpe on the leaderboard.
