Backtest Performance Metrics

Once a strategy is built, it must be validated against historical data. This is backtesting. But does a high CAGR alone make a good strategy? What about a CAGR of 30% with an MDD of -50%? A single metric can mislead.

Return Metrics

CAGR

CAGR (Compound Annual Growth Rate) annualizes the total return over the investment period.

CAGR = (Final Value / Initial Value)^(1/Years) - 1

A total return of 33% over 3 years translates to a CAGR of about 10%. It enables comparison across strategies with different time horizons.

CAGR tells you only the magnitude of returns, not how volatile the path was. Two strategies with the same CAGR may have followed vastly different trajectories: one steady, the other falling 40% before recovering.
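The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `cagr` and its argument names are chosen here for the example.

```python
def cagr(initial_value: float, final_value: float, years: float) -> float:
    """Return the compound annual growth rate as a decimal (0.10 means 10%)."""
    if initial_value <= 0 or years <= 0:
        raise ValueError("initial_value and years must be positive")
    return (final_value / initial_value) ** (1.0 / years) - 1.0


# A 33% total return over 3 years: 100 grows to 133.
print(f"CAGR: {cagr(100.0, 133.0, 3.0):.2%}")  # roughly 10%
```

Note that only the start and end values enter the calculation, which is exactly why CAGR says nothing about the path in between.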

Alpha

Alpha = Strategy Return - Benchmark Return

Alpha measures the strategy’s excess return over a market benchmark, with both returns measured over the same period. Common benchmarks include the KOSPI and the S&P 500. Positive Alpha means the strategy outperformed the market.

The goal of a quant strategy is to generate positive Alpha. Market returns can be captured by simply buying an index fund. A strategy’s value lies in the Alpha it adds on top.
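As a small sketch of the simple definition used here, Alpha can be computed as the difference between two annualized returns over the same period. The helper names and the sample values are hypothetical. This is a plain return difference; it does not adjust for how much market risk the strategy took, which regression-based definitions of alpha do.

```python
def annualized_return(initial_value: float, final_value: float, years: float) -> float:
    """Annualized (CAGR-style) return as a decimal."""
    return (final_value / initial_value) ** (1.0 / years) - 1.0


def simple_alpha(strategy_return: float, benchmark_return: float) -> float:
    """Excess return of the strategy over the benchmark, as a decimal."""
    return strategy_return - benchmark_return


# Hypothetical 3-year results: strategy 100 -> 172.8, benchmark 100 -> 133.1.
strategy = annualized_return(100.0, 172.8, 3.0)   # about 20% per year
benchmark = annualized_return(100.0, 133.1, 3.0)  # about 10% per year
print(f"Alpha: {simple_alpha(strategy, benchmark):.2%}")
```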

Risk Metrics

MDD

MDD (Maximum Drawdown) is the largest peak-to-trough decline during the strategy’s operation.

MDD = (Trough - Peak) / Peak × 100%

It answers “how much would you have lost in the worst historical case?” An MDD of -30% means that at some point assets fell 30% from a peak, for example from $1 million to $700,000.

MDD matters because of psychological limits. Even with a high CAGR, a strategy with -50% MDD is difficult to maintain in practice. Few investors can tolerate watching their assets halve.
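A minimal sketch of how MDD can be computed from a series of portfolio values follows. It tracks the running peak and records the deepest decline from it, returning a negative fraction so that the result matches the -30% style used in this post. The function name and the sample equity curve are made up for illustration.

```python
from typing import Sequence


def max_drawdown(values: Sequence[float]) -> float:
    """Largest peak-to-trough decline as a negative fraction (-0.30 means -30%)."""
    if not values:
        raise ValueError("values must not be empty")
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                       # highest value seen so far
        worst = min(worst, (value - peak) / peak)     # decline from that peak
    return worst


# Hypothetical equity curve: peaks at 1.2M, falls to 840K, then recovers.
equity = [1_000_000, 1_200_000, 840_000, 1_100_000]
print(f"MDD: {max_drawdown(equity):.0%}")
```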

Sharpe Ratio

Sharpe Ratio = (Strategy Return - Risk-Free Rate) / Standard Deviation of Returns

Sharpe Ratio measures return efficiency relative to risk. For the same return, lower volatility yields a higher Sharpe. The ranges below are common rules of thumb for an annualized Sharpe Ratio, not strict cutoffs.

Sharpe Ratio	Interpretation
< 0	Worse than risk-free
0 – 1.0	Average
1.0 – 2.0	Good
> 2.0	Excellent

Consider a strategy with CAGR 30% and MDD -50% versus one with CAGR 15% and MDD -15%. The latter likely has a higher Sharpe Ratio, since deep drawdowns usually go together with high volatility. Returns are half, but risk is far lower. Return magnitude and return efficiency are different concepts.
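The Sharpe formula can be sketched with the standard library alone. Several choices below are assumptions for this example rather than anything fixed by the formula: the input is a list of periodic returns, the annual risk-free rate is split evenly across periods, and the result is annualized by multiplying by the square root of the number of periods per year, a common convention.

```python
import math
import statistics
from typing import Sequence


def sharpe_ratio(
    periodic_returns: Sequence[float],
    risk_free_rate_annual: float = 0.0,
    periods_per_year: int = 252,
) -> float:
    """Annualized Sharpe Ratio from periodic returns (assumed convention)."""
    rf_per_period = risk_free_rate_annual / periods_per_year  # simple approximation
    excess = [r - rf_per_period for r in periodic_returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


# Two hypothetical return series with the same mean (1% per period).
steady = [0.010, 0.012, 0.008, 0.010]
volatile = [0.05, -0.03, 0.04, -0.02]
print(sharpe_ratio(steady) > sharpe_ratio(volatile))  # same return, lower volatility
```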

Operational Metrics

Win Rate

Win rate is the proportion of profitable trades. A 60% win rate means 6 profitable trades out of 10.

A high win rate does not guarantee good performance if the losing trades are large. Conversely, a 30% win rate can still produce strong results if the winning trades are large enough. The combination of win rate and the ratio of average gain to average loss matters more than win rate alone.
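One way to make this concrete is the expected profit per trade, sometimes called expectancy. The sketch below uses hypothetical numbers, with gains and losses expressed in the same arbitrary unit.

```python
def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Expected profit per trade; avg_loss is given as a positive magnitude."""
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# 60% win rate, but losses are twice the size of gains: negative expectancy.
print(expectancy(0.60, avg_win=1.0, avg_loss=2.0))
# 30% win rate, but winners are five times the size of losers: positive expectancy.
print(expectancy(0.30, avg_win=5.0, avg_loss=1.0))
```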

Turnover

Turnover measures how much of the portfolio is replaced over a given period, for example at each rebalancing.

High turnover means high transaction costs. Each buy/sell incurs commissions, and slippage (the gap between expected and actual execution price) accumulates. Backtests that ignore transaction costs overstate the performance of high-turnover strategies.
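This post does not fix a formula for turnover. One common definition, assumed here, is one-way turnover: half the sum of absolute weight changes between two rebalancing dates. The tickers and weights below are made up.

```python
from typing import Dict


def one_way_turnover(old_weights: Dict[str, float], new_weights: Dict[str, float]) -> float:
    """Fraction of the portfolio replaced between two rebalancing dates."""
    tickers = set(old_weights) | set(new_weights)
    total_change = sum(
        abs(new_weights.get(t, 0.0) - old_weights.get(t, 0.0)) for t in tickers
    )
    return 0.5 * total_change  # each swap is counted once, not as a buy plus a sell


# Hypothetical rebalance: stock B is fully replaced by stock C.
before = {"A": 0.5, "B": 0.5}
after = {"A": 0.5, "C": 0.5}
print(f"Turnover: {one_way_turnover(before, after):.0%}")
```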

Backtesting Pitfalls

Strong backtest results do not guarantee real-world success.

Look-ahead Bias

Look-ahead bias is the use of data that was not yet available at the simulated point in time. For example, quarterly earnings for the period ending March 31 are reported weeks later. Using that data on March 31 means using information that was not yet available. The same rule applies to every input: momentum scores, for instance, must use only data available as of the rebalancing date.
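A simple guard against this bias is to filter every record by its publication date rather than by the period it describes. The sketch below assumes a hypothetical record layout with both dates; the ticker, dates, and figures are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class EarningsRecord:
    ticker: str
    period_end: date   # the quarter the figures describe
    published: date    # when the figures became public
    eps: float


def latest_known(
    records: List[EarningsRecord], ticker: str, as_of: date
) -> Optional[EarningsRecord]:
    """Most recent record for a ticker that was already published on as_of."""
    known = [r for r in records if r.ticker == ticker and r.published <= as_of]
    return max(known, key=lambda r: r.period_end, default=None)


# Hypothetical data: Q1 figures (period ending March 31) are published in May.
RECORDS = [
    EarningsRecord("AAA", date(2023, 12, 31), date(2024, 2, 20), 1.10),
    EarningsRecord("AAA", date(2024, 3, 31), date(2024, 5, 15), 1.35),
]
# On March 31 only the previous quarter is visible.
print(latest_known(RECORDS, "AAA", date(2024, 3, 31)))
```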

Survivorship Bias

Survivorship bias is the distortion that comes from excluding delisted stocks from the dataset. Backtesting only on surviving stocks inflates performance. In reality, you might have invested in a stock that was later delisted, resulting in losses. This bias stems directly from data source limitations: many datasets cover only stocks that are still listed.
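If the data source keeps listing and delisting dates, the investable universe can be rebuilt for each backtest date so that later-delisted stocks stay in the sample. The sketch below assumes such a listing history exists; the tickers and dates are hypothetical.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

# Hypothetical listing history: ticker -> (listed_on, delisted_on or None).
LISTINGS: Dict[str, Tuple[date, Optional[date]]] = {
    "AAA": (date(2010, 1, 4), None),
    "BBB": (date(2012, 6, 1), date(2019, 9, 30)),  # later delisted
    "CCC": (date(2021, 3, 15), None),
}


def universe_as_of(
    listings: Dict[str, Tuple[date, Optional[date]]], as_of: date
) -> List[str]:
    """Tickers that were actually tradeable on the given date."""
    return sorted(
        ticker
        for ticker, (listed_on, delisted_on) in listings.items()
        if listed_on <= as_of and (delisted_on is None or as_of < delisted_on)
    )


# A 2015 backtest date must still include BBB, even though it no longer trades.
print(universe_as_of(LISTINGS, date(2015, 1, 2)))
```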

Overfitting

Overfitting is building a strategy that fits historical data too closely, capturing noise rather than a repeatable pattern. Excessive parameter tuning produces near-flawless past performance but poor future results. Minimizing parameter count and validating on out-of-sample data are the standard countermeasures.
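Out-of-sample validation starts with a chronological split: parameters are tuned on the earlier part of the history only, and the later part is held back for a final check. The sketch below shows one simple way to do the split; the 70/30 default is an arbitrary choice for illustration.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    observations: Sequence[float], in_sample_fraction: float = 0.7
) -> Tuple[List[float], List[float]]:
    """Split time-ordered data into in-sample and out-of-sample parts.

    The data is never shuffled: the out-of-sample part must come strictly
    after the in-sample part so that tuning cannot see the test period.
    """
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = int(len(observations) * in_sample_fraction)
    return list(observations[:cut]), list(observations[cut:])


# Hypothetical monthly returns, oldest first.
monthly_returns = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04, 0.00, 0.01, -0.03, 0.02]
in_sample, out_of_sample = chronological_split(monthly_returns, 0.8)
print(len(in_sample), len(out_of_sample))
```

A large gap between in-sample and out-of-sample performance is a warning sign that the parameters were fitted to noise.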

Ignoring Transaction Costs

Backtests without commissions and slippage show better results than reality. The gap is especially large for high-turnover strategies. Setting a realistic fee rate during backtesting is essential.
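The effect is easy to see with a rough cost model: subtract the traded value times a cost rate from each period's gross return. The model and the 0.25% rate (commission plus estimated slippage) are assumptions for this example, not measured figures.

```python
def net_return(gross_return: float, traded_fraction: float, cost_rate: float) -> float:
    """Approximate return for one period after trading costs.

    traded_fraction: value of buys plus sells divided by portfolio value.
    cost_rate: commission plus estimated slippage per unit of value traded.
    """
    return gross_return - traded_fraction * cost_rate


COST_RATE = 0.0025  # hypothetical 0.25% per trade

# Both strategies earn 1% gross in the period.
# High turnover: the whole portfolio is sold and rebought (traded fraction 2.0).
high_turnover = net_return(0.01, traded_fraction=2.0, cost_rate=COST_RATE)
# Low turnover: 10% of the portfolio is swapped (traded fraction 0.2).
low_turnover = net_return(0.01, traded_fraction=0.2, cost_rate=COST_RATE)
print(f"{high_turnover:.2%} vs {low_turnover:.2%}")
```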

Ignoring Liquidity

Small-cap stocks with low trading volume appear tradeable in backtests but may not execute at the desired prices in practice. Filtering out low-liquidity stocks with a minimum market cap threshold, often combined with a minimum trading volume, is standard practice.
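A liquidity filter of this kind can be sketched as a simple screen applied before stock selection. The field names, the thresholds, and the additional average-daily-trading-value check are assumptions made for this example.

```python
from typing import Dict, List, Union

Stock = Dict[str, Union[str, float]]


def filter_liquid(
    stocks: List[Stock], min_market_cap: float, min_daily_value: float
) -> List[str]:
    """Tickers that pass both the size and the trading-value thresholds."""
    return [
        str(s["ticker"])
        for s in stocks
        if float(s["market_cap"]) >= min_market_cap
        and float(s["avg_daily_value"]) >= min_daily_value
    ]


# Hypothetical universe, values in USD.
universe = [
    {"ticker": "AAA", "market_cap": 5e9, "avg_daily_value": 2e7},
    {"ticker": "BBB", "market_cap": 8e7, "avg_daily_value": 1e5},   # too small
    {"ticker": "CCC", "market_cap": 1e9, "avg_daily_value": 3e5},   # too thinly traded
]
print(filter_liquid(universe, min_market_cap=5e8, min_daily_value=1e6))
```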

Backtest performance metrics span three axes: returns (CAGR, Alpha), risk (MDD, Sharpe Ratio), and operations (Win Rate, Turnover). A single metric distorts judgment. All three axes must be examined together to assess a strategy’s true value.

Even strong backtest results may be contaminated by five pitfalls: Look-ahead Bias, Survivorship Bias, Overfitting, ignoring transaction costs, and ignoring liquidity. Reading results and questioning results must go hand in hand.

The next post will cover how to combine individual indicators into a single score and construct portfolios — factor scoring and rebalancing.

References

Investopedia — Compound Annual Growth Rate (CAGR)
Investopedia — Maximum Drawdown (MDD)
Investopedia — Sharpe Ratio
Investopedia — Overfitting
Marcos López de Prado, Advances in Financial Machine Learning (2018)
