The false discovery rate (FDR) measures the share of false positives in a set of statistical tests. I develop simple and intuitive bounds on the FDR in cross-sectional predictability publications. The simplest bound requires just a few lines of math and finds $\text{FDR} \le 25\%$ based on summary statistics in eight out of nine previous studies. A more refined bound finds $\text{FDR} \le 9\%$. The FDR is small because randomly selecting accounting ratios produces statistically significant predictability far more often than would occur if there were no predictability. The bounds also reconcile the disparate FDR estimates in the literature.
- Paper ID: 2206.15365
- Title: Most claimed statistical findings in cross-sectional return predictability are likely true
- Author: Andrew Y. Chen (Federal Reserve Board)
- Classification: q-fin.GN (Quantitative Finance - General Finance)
- Publication Date: October 2025 (First released on SSRN: August 27, 2021)
- Paper Link: https://arxiv.org/abs/2206.15365
The false discovery rate (FDR) measures the proportion of false positives in statistical testing. This paper develops simple and intuitive FDR bounds for cross-sectional return predictability research. The simplest bound requires only a few lines of mathematical calculation and, based on summary statistics from eight of nine prior studies, finds FDR ≤ 25%. More refined bounds find FDR ≤ 9%. The small FDR arises because randomly selected accounting ratios produce statistically significant predictability at frequencies far exceeding those expected under the null hypothesis of no predictability. These bounds also reconcile disagreements between different FDR estimates in the literature.
Researchers have discovered hundreds of cross-sectional stock return predictors, a richness that raises concerns about multiple testing problems. Intuitively, if researchers conduct many tests, some tests may be statistically significant purely by chance even under the null hypothesis of no predictability.
- Multiple Testing Problem: Large numbers of factor discoveries may lead to false positive results
- FDR Estimation Disagreement: Existing literature exhibits enormous variation in FDR estimates, ranging from nearly 0% to over 45%
- Publication Bias: Statistically significant results are more likely to be published, affecting true FDR estimates
- Methodological Controversy: Different research teams using different methods reach drastically different conclusions
Accurately estimating FDR is crucial for understanding the credibility of the financial anomalies literature, directly affecting investment strategy formulation and academic research direction.
- Simple and Intuitive FDR Bounds: Proposes the "Easy Bound" method, requiring only a few lines of mathematical calculation to estimate the FDR upper bound
- Visual Bound Method: Develops "Visual Bound," providing tighter FDR bounds through histogram decomposition
- Literature Reconciliation: Unifies explanations for vastly different FDR estimates in existing literature, finding that disagreements stem primarily from interpretation differences rather than data differences
- Empirical Findings: Demonstrates that randomly selected accounting ratios produce significant predictability far more often than the null of no predictability implies, providing empirical support for a small FDR
Define the predictive ability of cross-sectional signal $i$ by $\bar{r}_i$, typically obtained by constructing a long-short portfolio based on signal $i$ and calculating its sample mean return. The null hypothesis is $E(\bar{r}_i) = 0$.
- $t_i \equiv \bar{r}_i / \text{SE}_i$ is the t-statistic
- Under the null hypothesis: $t_i \mid \text{null}_i \sim \text{Normal}(0, 1)$
- Discovery definition: $|t_i| > 2$ (corresponding to the 5% significance level)
- FDR definition: $\text{FDR}_{|t|>2} \equiv \Pr(\text{null}_i \mid |t_i| > 2)$
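As a rough illustration of these definitions, the sketch below computes a t-statistic from a series of long-short returns and checks the discovery rule. The function names and the return series are hypothetical, and the standard error assumes i.i.d. returns, which is a simplification and not necessarily how the underlying studies compute it.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(returns):
    """Sample mean return divided by its standard error."""
    n = len(returns)
    # Assumption: i.i.d. returns, so SE = sample std / sqrt(n).
    standard_error = stdev(returns) / sqrt(n)
    return mean(returns) / standard_error


def is_discovery(t_stat, threshold=2.0):
    """A signal counts as a discovery when |t| exceeds the threshold."""
    return abs(t_stat) > threshold


# Hypothetical long-short returns, made up purely for illustration.
example_returns = [0.01, 0.03, -0.01, 0.02, 0.00]
t = t_statistic(example_returns)
print(round(t, 3), is_discovery(t))
```

With only five observations the t-statistic falls short of 2, so this made-up signal would not count as a discovery.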
Applying Bayes' rule yields:
$$\text{FDR}_{|t|>2} = \frac{\Pr(|t_i| > 2 \mid \text{null}_i)\,\Pr(\text{null}_i)}{\Pr(|t_i| > 2)} \le \frac{5\%}{\Pr(|t_i| > 2)}$$
The inequality uses $\Pr(|t_i| > 2 \mid \text{null}_i) \approx 5\%$ and $\Pr(\text{null}_i) \le 1$. The bound is intuitive: if the tail probability under the null hypothesis (numerator) cannot explain the observed tail probability (denominator), then the FDR must be small.
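The bound above needs only one summary statistic, the share of tested signals with $|t_i| > 2$. A minimal sketch, where `easy_bound` and its parameter names are illustrative and not taken from the paper:

```python
def easy_bound(share_significant, null_tail_prob=0.05):
    """Upper bound on the FDR among |t| > 2 signals.

    share_significant: observed Pr(|t| > 2) across all tested signals.
    null_tail_prob:    Pr(|t| > 2 | null), about 5% under Normal(0, 1).
    """
    if not 0 < share_significant <= 1:
        raise ValueError("share_significant must be in (0, 1]")
    # A probability cannot exceed 1, so cap the bound there.
    return min(1.0, null_tail_prob / share_significant)


# If 20% of tested signals have |t| > 2, the bound is 5% / 20%.
print(easy_bound(0.20))
```

When the observed share is at or below 5%, the bound is uninformative and the function returns 1.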
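The bound can be tightened by bounding $\Pr(\text{null}_i)$ from the data. Under the null, $\Pr(|t_i| < 0.5 \mid \text{null}_i) \approx 0.38$, and null signals are only part of all signals with small t-statistics, so:
$$\Pr(|t_i| < 0.5) \ge 0.38\,\Pr(\text{null}_i)$$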
Combining yields a tighter bound:
$$\text{FDR}_{|t|>2} \le \left[\frac{5\%}{\Pr(|t_i| > 2)}\right]\left[\frac{\Pr(|t_i| < 0.5)}{0.38}\right]$$
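The tighter bound adds a second summary statistic, the share of signals with $|t_i| < 0.5$. The sketch below is one way to code it; the function name, parameter names, and example shares are hypothetical.

```python
def refined_bound(share_significant, share_small,
                  null_tail_prob=0.05, null_small_prob=0.38):
    """Upper bound on the FDR among |t| > 2 signals using small-t information.

    share_significant: observed Pr(|t| > 2).
    share_small:       observed Pr(|t| < 0.5).
    null_small_prob:   Pr(|t| < 0.5 | null), about 0.38 under Normal(0, 1).
    """
    if not 0 < share_significant <= 1:
        raise ValueError("share_significant must be in (0, 1]")
    # Largest Pr(null) consistent with the observed share of small t-stats.
    pr_null_max = min(1.0, share_small / null_small_prob)
    return min(1.0, null_tail_prob / share_significant * pr_null_max)


# Hypothetical shares, chosen only to show the calculation.
print(round(refined_bound(share_significant=0.30, share_small=0.25), 4))
```

Capping `pr_null_max` at 1 means the refined bound is never looser than the simple bound computed from `share_significant` alone.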
- Uses data mining studies as worst-case scenarios
- Estimates the distribution of unpublished results through conservative extrapolation
- Avoids direct dependence on published literature statistics
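The Visual Bound decomposes the t-statistic histogram into null and alternative components. For any bin $b$:
$$\Pr(|t_i| \in b) = \Pr(|t_i| \in b \mid \text{null}_i)\,\Pr(\text{null}_i) + \Pr(|t_i| \in b \mid \text{alt}_i)\,\Pr(\text{alt}_i)$$
Because the alternative component is non-negative, the null component cannot exceed the data in any bin. Imposing this constraint yields an upper bound on the FDR.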
- Plot the histogram of $|t_i|$ for data mining signals
- Plot the maximum null distribution histogram that still fits the data interior
- Draw a vertical line at 2.0; the ratio of null area to data area to the right of this line estimates the FDR bound
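The steps above can be sketched numerically without plotting. This is one interpretation, not the paper's code: it finds the largest null share whose histogram stays under the data histogram in every bin on $[0, 2)$, then takes the ratio of null area to data area beyond 2. The bin width, the fitting range, and the synthetic t-statistics are all assumptions made for illustration.

```python
from statistics import NormalDist


def visual_bound(abs_t, bin_width=0.25, fit_max=2.0, cutoff=2.0):
    """Return (FDR upper bound, largest admissible Pr(null))."""
    z = NormalDist()  # standard normal: distribution of t under the null
    n = len(abs_t)
    n_bins = int(round(fit_max / bin_width))
    pr_null_max = 1.0
    for k in range(n_bins):
        lo, hi = k * bin_width, (k + 1) * bin_width
        data_share = sum(lo <= x < hi for x in abs_t) / n
        null_share = 2 * (z.cdf(hi) - z.cdf(lo))  # Pr(lo <= |Z| < hi)
        # The null component cannot exceed the data in this bin.
        pr_null_max = min(pr_null_max, data_share / null_share)
    share_sig = sum(x > cutoff for x in abs_t) / n
    if share_sig == 0:
        return 1.0, pr_null_max  # no discoveries, bound is uninformative
    null_tail = 2 * (1 - z.cdf(cutoff))  # Pr(|Z| > cutoff)
    # Ratio of null area to data area to the right of the cutoff.
    return min(1.0, pr_null_max * null_tail / share_sig), pr_null_max


# Synthetic, deterministic |t| values: half null, half alternative with
# standard deviation 3 (a made-up mixture, not data from any study).
m = 2000
grid = [(j + 0.5) / m for j in range(m)]
synthetic = ([abs(NormalDist(0, 1).inv_cdf(p)) for p in grid]
             + [abs(NormalDist(0, 3).inv_cdf(p)) for p in grid])
bound, pr_null = visual_bound(synthetic)
print(round(bound, 3), round(pr_null, 3))
```

In this synthetic mixture the true null share is 0.5, and the estimated maximum null share comes out above that, which is the expected direction for an upper bound.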
- Data Mining Studies:
- Yan and Zheng (2017): 18,000 accounting ratios
- Chordia, Goyal, and Saretto (2020): approximately 200 accounting variables
- Chen, Lopez-Lira, and Zimmermann (2025): 29,000 signals
- Meta-Research Data:
- Green, Hand, Zhang (2013)
- Chen, Zimmermann (2020): 77 published predictive factors
- Harvey, Liu, Zhu (2016)
- McLean, Pontiff (2016)
- Jensen, Kelly, Pedersen (2021)
- Jacobs, Muller (2020)
- FDR Bounds: Upper bound estimates of false discovery rate
- Significance Proportion: Proportion of signals with $|t_i| > 2$
- Small t-statistic Proportion: Proportion of signals with $|t_i| < 0.5$
- Uses equal-weighted and value-weighted portfolios
- Considers different factor model adjustments (CAPM, FF3, FF3+momentum)
- Employs Fama-French clustered bootstrap for standard error calculation
Based on eight of nine studies, FDR ≤ 25%:
- At least 20% of accounting ratios in data mining studies produce $|t_i| > 2$
- Applying the formula yields: $\text{FDR}_{|t|>2} \le 5\% / 0.20 = 25\%$
More precise estimates using the Chen, Lopez-Lira, and Zimmermann (CLZ) data:
- Of 29,000 signals, 9,700 satisfy $|t_i| > 2$, and 6,300 satisfy $|t_i| < 0.5$
- Yields: $\text{FDR}_{|t|>2} \le 8.5\%$, meaning at least 91.5% of findings are true
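As a check on the arithmetic, plugging the rounded counts above into the refined bound gives approximately the reported figure (the intermediate values are rounded):

$$\text{FDR}_{|t|>2} \le \left[\frac{5\%}{9{,}700 / 29{,}000}\right]\left[\frac{6{,}300 / 29{,}000}{0.38}\right] \approx 14.9\% \times 0.572 \approx 8.5\%$$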
| Weighting | Factor Adjustment | FDR Upper Bound | Significance Proportion |
|---|---|---|---|
| Equal-weighted | Raw returns | 8.6% | 32.7% |
| Equal-weighted | FF3 | 7.3% | 34.9% |
| Value-weighted | CAPM | 19.0% | 17.9% |
| Value-weighted | FF3+momentum | 41.7% | 10.5% |
- Weighting Impact: Value-weighting significantly reduces significance proportion and increases FDR bounds
- Factor Adjustment Impact: FF3+momentum adjustment has the largest effect on value-weighted portfolios
- Dataset Robustness: Data mining results from three independent research teams are consistent
- Harvey, Liu, Zhu (2016): Reinterprets findings to show FDR of only 12%, contrary to the original claim that "most findings are false"
- Harvey and Liu (2020): The 0.1% of "true" strategies actually corresponds to selecting the most extreme value-weighted FF3+momentum specification
- Chordia, Goyal, Saretto (2020): The 45% FDR estimate stems from ignoring information about small t-statistics in calibration
- Benjamini and Hochberg (1995): Classical FDR control methods
- Storey (2002): Direct FDR estimation methods
- Sorić (1989): Earliest FDR concepts
- Green, Hand, Zhang (2013): Survey of cross-sectional return prediction
- McLean and Pontiff (2016): Out-of-sample decay studies
- Chen and Zimmermann (2022): Open-source cross-sectional asset pricing
- Harvey, Liu, Zhu (2016): Multiple testing problems in financial economics
- Chen (2024): Discussion on whether t-statistic thresholds need to be raised
- Small FDR: At least 75% of claimed findings in cross-sectional predictability literature are true (FDR ≤ 25%)
- More Precise Estimates: Considering information about small t-statistics, at least 91% of findings are true (FDR ≤ 9%)
- Literature Reconciliation: Different FDR estimates stem primarily from interpretation differences rather than data or methodological differences
- Empirical Support: High significance rates of random accounting ratios provide direct evidence for small FDR
- Statistical vs. Economic Significance: "True findings" refer only to statistical significance and non-zero alpha, not considering transaction costs, information costs, and other economic factors
- Out-of-Sample Performance: Statistical truth does not equate to economic feasibility
- Structural Changes: Limited consideration of how changes in market structure affect predictability
- Data Mining Assumptions: Assumes the research process does not produce higher false discovery rates than random data mining
- Economic Significance: Combine transaction costs and market frictions to assess economic value
- Dynamic FDR: Consider time-varying predictability and market conditions
- Causal Inference: Extend from predictive relationships to causal relationships
- Machine Learning Methods: FDR control in high-dimensional settings
- Method Simplicity: The Easy Bound method is extremely simple, requiring only summary statistics for calculation
- Strong Intuitiveness: Visual Bound provides intuitive histogram decomposition explanations
- Empirical Robustness: Based on consistent results from multiple independent research teams
- Literature Contribution: Successfully reconciles long-standing disagreements in FDR estimates
- Solid Theory: Based on fundamental probability principles with rigorous mathematical derivations
- Conservative Bounds: Bound methods may be overly conservative; true FDR may be smaller
- Independence Assumptions: Although the bounds are claimed not to require independence, correlation among signals still affects estimation precision
- Data Dependence: Results depend on the quality and representativeness of specific data mining studies
- Temporal Stability: Insufficient discussion of FDR changes over time
- Economic Interpretation: Lacks in-depth discussion of the relationship between statistical and economic significance
- Academic Value: Provides important statistical credibility assessment for financial anomalies literature
- Practical Significance: Offers investors and regulators reference points for factor validity
- Methodological Contribution: Simple and effective FDR bound methods can be generalized to other fields
- Policy Impact: Influences understanding of financial market efficiency and anomaly persistence
- Academic Research: Assessing statistical credibility of newly discovered factors
- Investment Practice: Screening investment strategies with statistical support
- Regulatory Policy: Evaluating systematic risk of market anomalies
- Risk Management: Understanding the statistical foundation of factor exposures
This paper cites 22 important references covering core and cutting-edge research in FDR methodology, financial anomaly discovery, and multiple testing control, providing a solid theoretical foundation and empirical support for the research.
Overall Assessment: This is an important contribution to financial econometrics. It addresses a long-standing controversy with simple, elegant methods and provides new tools for assessing the statistical credibility of the financial anomalies literature.