Backtest Pitfalls Case Study

An earlier post in this series listed five backtest pitfalls as one-liners in a table. Two of them, look-ahead bias and survivorship bias, sit apart from the others: they distort results in one direction only, the direction that inflates CAGR.

These are not two-sided noise but systematic bias, so missing them means carrying the inflation straight into live trading. Walk-forward analysis, covered in the companion post, does not catch them either. Walk-forward only partitions time, so future information leaked into the data, or the survivors-only universe, flows untouched into every fold. The patterns only become clear at the case level.

Look-ahead Bias

This is any situation where a decision at time t uses information from time t+k. Accidental future leakage shows up in subtle forms in backtests.

Case 1. Full-Period Momentum Normalization

Consider a strategy that ranks securities by z-scored momentum and picks the top N. If the z-score at time t uses the mean and standard deviation of the full period, those statistics carry information from t+k. The score at t silently reflects data that was not yet available.

The fix is straightforward. Compute the mean and standard deviation on a rolling window that ends at t. Nothing from after t enters the normalization.
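The difference between the two normalizations can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names, the window length of 3, and the sample momentum values are assumptions made up for the example, not part of any particular strategy.

```python
from statistics import mean, pstdev

def full_period_zscores(values):
    """Biased: every score uses the mean/std of the whole series, future included."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def rolling_zscores(values, window):
    """Point-in-time: the score at index t uses only values[t-window+1 .. t]."""
    scores = []
    for t in range(len(values)):
        if t + 1 < window:
            scores.append(None)  # not enough history yet
            continue
        hist = values[t - window + 1 : t + 1]
        sigma = pstdev(hist)
        scores.append((values[t] - mean(hist)) / sigma if sigma > 0 else 0.0)
    return scores

# Hypothetical momentum readings; the last value is a late spike.
momentum = [0.01, 0.02, 0.03, 0.02, 0.04, 0.20]
biased = full_period_zscores(momentum)
clean = rolling_zscores(momentum, window=3)

# The score at index 2 should not depend on what happens at index 5.
print("full-period z at t=2:", round(biased[2], 3))
print("rolling z at t=2:   ", round(clean[2], 3))
```

Changing the final value of the series changes every full-period score, including the early ones, while the rolling scores before that point stay exactly the same. That invariance is a quick test worth running on any normalization step.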

Case 2. Ignoring Financial Disclosure Lag

A fiscal year ending 2024-12-31 is typically disclosed in March 2025. Using that statement in a backtest decision dated January 2025 means consuming information that had not yet been published.

The fix is to model the disclosure lag explicitly. Sources like DART or SimFin expose both the period end and the actual filing date as separate columns. The backtest should treat a statement as known only after its filing date, never from its period end.
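A lookup keyed on the filing date rather than the period end might look like the sketch below. The `Statement` record, its field names, and the dates and EPS figures are hypothetical; real sources name these columns differently. The comparison is strict (`filed_on < as_of`) on the assumption that a filing may land after the market closes on its filing date.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Statement:
    period_end: date  # fiscal period the numbers describe
    filed_on: date    # date the statement became public
    eps: float

def latest_known(statements, as_of):
    """Most recent statement that was already public before `as_of`, else None."""
    visible = [s for s in statements if s.filed_on < as_of]
    return max(visible, key=lambda s: s.period_end, default=None)

# Hypothetical filings for one company.
filings = [
    Statement(date(2023, 12, 31), date(2024, 3, 15), eps=1.0),
    Statement(date(2024, 12, 31), date(2025, 3, 14), eps=2.0),
]

# In January 2025 the FY2024 numbers exist but are not yet public.
january_view = latest_known(filings, date(2025, 1, 15))
print(january_view.period_end)  # FY2023 is still the latest usable statement
```

Filtering on `period_end` instead of `filed_on` would hand the January decision the FY2024 figures roughly two months early, which is exactly the leak described above.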

Case 3. Close-Signal, Close-Fill

A common shortcut is to compute a signal on the closing price (moving-average cross, RSI) and assume execution at the same close. In reality, by the time the close has printed, orders can no longer be placed at that price. The realistic fill is the next-day open or VWAP.

Close-signal-close-fill misses the next-day gap effect entirely, effectively assuming an entry price that was not obtainable and is often better than the realistic one. The fix is to fill at the next session’s open or VWAP, or to keep the close fill and add slippage.
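The two fill assumptions can be compared with a small helper. This is a sketch under stated assumptions: the `entry_price` function, the bar layout, the prices, and the 50 bps slippage figure are all invented for illustration, and only long entries are modeled.

```python
def entry_price(bars, t, mode="next_open", slippage_bps=0.0):
    """Entry price for a buy signal computed on the close of bar t.

    mode="same_close" is the optimistic shortcut; mode="next_open" fills at the
    following session's open. Slippage raises the price paid on a buy.
    """
    if mode == "same_close":
        price = bars[t]["close"]
    elif mode == "next_open":
        if t + 1 >= len(bars):
            return None  # no next session to fill in
        price = bars[t + 1]["open"]
    else:
        raise ValueError(f"unknown fill mode: {mode}")
    return price * (1 + slippage_bps / 10_000)

# Hypothetical two-day price path with an overnight gap up.
bars = [
    {"open": 100.0, "close": 102.0},  # signal fires on this close
    {"open": 104.0, "close": 105.0},  # position is marked at this close
]

exit_price = bars[1]["close"]
for mode in ("same_close", "next_open"):
    entry = entry_price(bars, 0, mode)
    print(f"{mode:>10}: entry {entry:.2f}, return {exit_price / entry - 1:.2%}")
```

On this made-up path the same-close fill reports roughly three times the return of the next-open fill. The gap will not always go against the trade, but a signal that fires on strength tends to meet it more often than not.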

Survivorship Bias

This happens when a backtest runs only on names that survived to today. Delisted names are usually the losers, and removing them inflates CAGR.

Case 1. S&P 500 Reconstitution Over Time

The S&P 500 is not a fixed roster. Somewhere between ten and twenty-odd names are replaced each year. Over a decade, well over a hundred names have come and gone.

“Backtesting on S&P 500 names over 10 years” depends on which year’s roster is used. Running today’s 500 against 10-year-old prices already filters to survivors. The 10-year-old roster should include names that have since been delisted, acquired, or otherwise removed.

The fix is to use a point-in-time index composition source such as Compustat or CRSP. Free APIs generally do not carry that information, so where commercial data is out of reach, the bias has to be acknowledged and results read conservatively.
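Once a membership history is available, selecting the universe as of each rebalance date is a simple filter. The sketch below assumes a table of (ticker, date added, date removed) rows; the tickers and dates are placeholders, not real index constituents.

```python
from datetime import date

# (ticker, added, removed); removed=None means still a member today.
# Hypothetical rows for illustration only.
MEMBERSHIP = [
    ("AAA", date(2010, 1, 4), None),
    ("BBB", date(2012, 6, 1), date(2018, 3, 1)),  # later removed from the index
    ("CCC", date(2019, 9, 23), None),
]

def universe_on(membership, as_of):
    """Tickers that were index members on `as_of`."""
    return sorted(
        ticker
        for ticker, added, removed in membership
        if added <= as_of and (removed is None or as_of < removed)
    )

print(universe_on(MEMBERSHIP, date(2015, 1, 2)))  # includes BBB, excludes CCC
print(universe_on(MEMBERSHIP, date(2020, 1, 2)))  # BBB is gone, CCC has joined
```

A backtest that applied the 2020 list to 2015 would trade CCC before it joined the index and would never hold BBB at all.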

Case 2. Limits of Yahoo Finance and Free APIs

Yahoo Finance, Korea’s KIS API, Naver quotes, and similar free sources rarely keep delisted names. Once a ticker is delisted, the symbol disappears or price queries return nothing.

The Korean market behaves the same way. Delisted KOSPI names lose their historical price data in free sources. A “10-year KOSPI factor backtest” on free data therefore carries built-in survivorship bias.

There are two ways out. Buying commercial data is the direct fix; where the budget does not allow it, results need to be read with a correction margin. Commonly cited academic estimates put CAGR inflation from survivorship bias at around 2–4% per year.
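A correction margin is easiest to appreciate once it is compounded. The sketch below reads the 2–4% figure as percentage points subtracted from the reported CAGR, which is an interpretation made for this example; the 15% headline result and the helper names are hypothetical.

```python
def haircut_range(backtest_cagr, low=0.02, high=0.04):
    """Return (pessimistic, optimistic) CAGR after subtracting an assumed inflation band."""
    return backtest_cagr - high, backtest_cagr - low

def terminal_multiple(cagr, years):
    """Growth of one unit of capital compounded at `cagr` for `years` years."""
    return (1 + cagr) ** years

backtest_cagr = 0.15  # hypothetical headline result from a survivors-only backtest
pessimistic, optimistic = haircut_range(backtest_cagr)

for label, rate in [
    ("as reported", backtest_cagr),
    ("minus 2 pts", optimistic),
    ("minus 4 pts", pessimistic),
]:
    multiple = terminal_multiple(rate, 10)
    print(f"{label:>12}: CAGR {rate:.1%}, 10-year multiple {multiple:.2f}x")
```

A few points of annual inflation look small in isolation, but over ten years they separate the reported outcome from the corrected one by a wide margin in terminal wealth.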

Adjacent Pitfalls

Two related biases tend to sit next to these.

Data snooping is finding a “good” combination by trying enough parameter combinations. Ten signal candidates × ten lookbacks × ten thresholds means 1,000 runs, and a few will look strong by pure chance. Walk-forward analysis, covered in the companion post, partly mitigates this but does not fully solve it.
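The effect of 1,000 runs can be reproduced with strategies that have no edge at all. The simulation below is a rough sketch: it assumes i.i.d. normal daily returns with zero mean and 1% volatility, one year of 252 trading days, and an annualized Sharpe ratio as the score. None of those choices come from a real strategy.

```python
import random
from statistics import mean, stdev

def sharpe(daily_returns):
    """Annualized Sharpe ratio, assuming 252 trading days and a zero risk-free rate."""
    return mean(daily_returns) / stdev(daily_returns) * (252 ** 0.5)

def best_of_random(n_strategies, n_days, seed=0):
    """Score `n_strategies` pure-noise return streams; return (best, average) Sharpe."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_strategies):
        daily = [rng.gauss(0.0, 0.01) for _ in range(n_days)]  # zero true edge
        scores.append(sharpe(daily))
    return max(scores), mean(scores)

best, average = best_of_random(n_strategies=1000, n_days=252)
print(f"average Sharpe across runs: {average:.2f}")
print(f"best Sharpe across runs:    {best:.2f}")
```

The average sits near zero, as it should for noise, yet the best run posts a Sharpe ratio that would look impressive in a pitch. Reporting only that winner is data snooping in its simplest form.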

Selection bias is reporting only the periods where the strategy looks good. Showing 2010–2020 and omitting 2008 changes the impression of the same strategy, and the statistical strength of the claim drops because the sample was chosen after seeing the results.
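How much the reporting window matters can be shown with a toy return series. The annual returns below are invented for illustration (a sharp loss in 2008, a rebound in 2009, then a steady 10% a year) and do not describe any real strategy or index.

```python
def cagr(annual_returns):
    """Compound annual growth rate from a list of yearly simple returns."""
    growth = 1.0
    for r in annual_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(annual_returns)) - 1.0

# Hypothetical annual returns, made up for this example.
returns_by_year = {2008: -0.40, 2009: 0.20}
returns_by_year.update({year: 0.10 for year in range(2010, 2021)})

def window(first, last):
    """Returns for the inclusive range of years [first, last]."""
    return [returns_by_year[year] for year in range(first, last + 1)]

print(f"2010-2020 CAGR: {cagr(window(2010, 2020)):.1%}")
print(f"2008-2020 CAGR: {cagr(window(2008, 2020)):.1%}")
```

Same strategy, same data, and the headline number drops substantially once the two omitted years are put back.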

Avoidance Checklist

Run through this list before running a new backtest or when reading existing results.

Does the signal calculation avoid using information from after t?
Are financial statements lagged to their actual filing dates?
Does execution happen after the signal time, at a point when orders can actually be placed?
Is the universe point-in-time, or is today’s roster being applied to past data?
Are transaction costs and slippage included in the simulation?

Some items cannot be resolved fully under free-data constraints. In those cases, marking the limitation explicitly and adding a margin to result interpretation is the next-best option.

Once these biases are named, they can be avoided. With free-data setups, survivorship bias in particular is hard to remove completely, and conservative reading of results is the rational stance. Combined with walk-forward analysis from the companion post, the checklist adds a second layer of reliability to a backtest: walk-forward handles the time split, and the checklist handles data integrity.

References

Investopedia — Look-Ahead Bias
Investopedia — Survivorship Bias
Investopedia — Data Snooping Bias
Marcos López de Prado, Advances in Financial Machine Learning (2018)
Bailey, D., López de Prado, M. (2014). “The Probability of Backtest Overfitting”
