
Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards

Werner Antweiler, Murray Z. Frank

The Journal of Finance · 2004 · 2397 citations




Source: Antweiler, W. & Frank, M. Z. (2004) · Journal of Finance 59(3), 1259–1294


TL;DR

An early, careful study of online retail chatter as a market signal. The authors use computational-linguistics text classification to label 1,559,621 messages posted during 2000 on Yahoo! Finance and Raging Bull about 45 companies (the 30 Dow Jones Industrial Average firms plus 15 Dow Jones Internet Index firms). Talk is not just noise: message posting helps predict volatility, and disagreement among posters is associated with more trading. The effect on returns is negative and short-lived: statistically significant but economically tiny.


The idea

Press reports and SEC prosecutions suggested that message boards could move markets. The paper asks three questions: (1) Does the level of message posting or its bullishness predict returns? (2) Is disagreement among posters associated with more trading, as in Harris and Raviv (1993), or with no trading, as the Milgrom–Stokey no-trade theorem implies? (3) Does posting predict volatility, as it would if posters act as "noise traders"? Methodologically, the paper is a template for turning unstructured posts into a quantitative sentiment signal.


How it is measured

  • Classify each message as BUY, HOLD, or SELL with a Naive Bayes classifier (via the Rainbow package), cross-checked against a support-vector machine. Both give similar results, so the Naive Bayes results are reported. The classifier is trained on 1,000 hand-classified messages and then applied out-of-sample to all 1,559,621.
  • Build a daily bullishness index B from the buy/sell counts and an agreement index A ∈ [0,1] (low A = high disagreement/dispersion of opinion).
  • Wall Street Journal news stories are used as controls. Volatility is modeled with a fractionally integrated realized-volatility news-response function (in the style of Andersen et al.), with GARCH, EGARCH, and GJR robustness checks in an appendix.
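
The two indices in the list above can be sketched in a few lines. The formulas used here, B = (buy − sell) / (buy + sell), the log variant B* = ln((1 + buy) / (1 + sell)), and A = 1 − sqrt(1 − B²), are the forms usually attributed to the paper but are not stated on this page, so treat them as assumptions to check against the original. The function names and the handling of days with no buy or sell messages are hypothetical.

```python
import math


def bullishness(n_buy: int, n_sell: int) -> float:
    """Share-based bullishness B = (buy - sell) / (buy + sell), in [-1, 1]."""
    total = n_buy + n_sell
    if total == 0:
        # Assumption: a day with no BUY or SELL messages is treated as neutral.
        return 0.0
    return (n_buy - n_sell) / total


def bullishness_log(n_buy: int, n_sell: int) -> float:
    """Log variant B* = ln((1 + buy) / (1 + sell)); grows with message volume."""
    return math.log((1 + n_buy) / (1 + n_sell))


def agreement(n_buy: int, n_sell: int) -> float:
    """Agreement A = 1 - sqrt(1 - B^2), in [0, 1]; low A means high disagreement."""
    b = bullishness(n_buy, n_sell)
    # max() guards against a tiny negative value from floating-point rounding.
    return 1.0 - math.sqrt(max(0.0, 1.0 - b * b))


# Toy counts for illustration only (not data from the paper).
unanimous_day = agreement(5, 0)   # every classified message says BUY
split_day = agreement(2, 2)       # evenly split between BUY and SELL
```

With these definitions, a unanimous day has A = 1 and an evenly split day has A = 0, which matches the reading that low A signals dispersion of opinion. HOLD messages do not enter either index in this sketch.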


Evidence

  • Returns: heavy posting predicts slightly negative next-day returns, even after controlling for bid–ask bounce. The effect is statistically significant but economically negligible relative to plausible transaction costs.
  • Trading volume / disagreement: consistent with Harris–Raviv, disagreement (a low agreement index) is associated with more trades contemporaneously, followed by a next-day reversal in which volume is lower than it otherwise would be.
  • Volatility: message posting helps predict volatility; the effect of bullishness or disagreement on volatility itself is weak.
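
To make "predicts" concrete, the sketch below regresses an outcome on the previous day's signal with a one-variable least-squares fit. It is deliberately far simpler than the paper's specifications (no news controls, no firm effects, no volatility model), and the function names and numbers are hypothetical, chosen only so the result is easy to check by hand.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares for y = intercept + slope * x with one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lagged_regression(signal, outcome):
    """Regress outcome[t] on signal[t-1], i.e. a one-day-ahead predictive fit."""
    return ols_slope_intercept(signal[:-1], outcome[1:])


# Made-up series (not from the paper): the outcome is exactly twice the
# previous day's signal, so the fit should recover that relationship.
log_message_count = [1.0, 2.0, 3.0, 4.0, 5.0]
volatility_proxy = [0.0, 2.0, 4.0, 6.0, 8.0]
slope, intercept = lagged_regression(log_message_count, volatility_proxy)
```

A positive slope in a fit like this is what "message posting helps predict volatility" means in the simplest case. Economic significance is a separate question: a coefficient can be statistically reliable and still too small to survive transaction costs, which is the finding reported for returns.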


Why it matters

A foundational reference for text and sentiment as data in finance. It established that informal online text carries measurable information about volume and volatility (less so returns), and it laid out a supervised text-classification pipeline (label → train → classify → aggregate to a daily index) that became a common template for alternative-data signals. For later sentiment measures built from media text and search data, see Tetlock (2007) and Da, Engelberg and Gao (2015).
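
The label → train → classify → aggregate pipeline can be illustrated with a small multinomial Naive Bayes written against the Python standard library. This is a sketch under stated assumptions, not the authors' Rainbow setup: the whitespace tokenizer, the add-one smoothing, the six training messages, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Assumption: lowercase and split on whitespace; real pipelines do more."""
    return text.lower().split()


def train(labeled_messages):
    """Count documents per class and words per class from (text, label) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_messages:
        class_docs[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / n_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def daily_counts(model, dated_messages):
    """Aggregate predicted labels into per-day BUY / HOLD / SELL counts."""
    counts = defaultdict(Counter)
    for day, text in dated_messages:
        counts[day][classify(model, text)] += 1
    return counts


# Step 1 (label): a toy hand-labeled set standing in for the 1,000 messages.
labeled = [
    ("strong buy going up", "BUY"),
    ("great earnings buy more", "BUY"),
    ("sell now going down", "SELL"),
    ("terrible earnings sell", "SELL"),
    ("holding wait and see", "HOLD"),
    ("no news holding", "HOLD"),
]
model = train(labeled)                      # Step 2 (train)
stream = [                                  # unlabeled messages with dates
    ("2000-01-03", "buy buy up"),
    ("2000-01-03", "sell sell down"),
    ("2000-01-04", "holding"),
]
counts = daily_counts(model, stream)        # Steps 3-4 (classify, aggregate)
```

The per-day counts are the raw material for a daily sentiment index. Keeping classification and aggregation as separate functions makes it easy to swap in a different classifier, which mirrors the paper's cross-check of Naive Bayes against a support-vector machine.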


Caveats

  • Message-board populations are unrepresentative and gameable (pump-and-dump, spam); the sample is a single year (2000) of large-cap and internet stocks.
  • Return effects are weak and short-horizon; transaction costs erode any tradable edge.
  • Results depend on classifier quality and on the hand-labeled training set; bag-of-words Naive Bayes has since been superseded by embedding- and LLM-based methods.


Key references

  • Antweiler, W. & Frank, M. (2004) — Is All That Talk Just Noise? — Journal of Finance
  • Tetlock, P. (2007) — Giving Content to Investor Sentiment — Journal of Finance
  • Da, Z., Engelberg, J. & Gao, P. (2015) — The Sum of All FEARS — Review of Financial Studies
  • Harris, M. & Raviv, A. (1993) — Differences of Opinion Make a Horse Race — Review of Financial Studies


Provenance: verified/generated from the paper's full text.


Community-maintained wiki. Not peer-reviewed; verify claims against the original paper.

Wiki last updated: June 22, 2026