
6 mistakes that most often inflate an Edge Score

An Edge Score grades the trades you upload, so any backtest mistake that pumps up your stats pumps up the score with them. The six that do it most: leaving out trading costs, overfitting, testing only in sample, cherry picking the date range, lookahead bias, and picking the best of many tries. Each one lifts a metric the score reads.

Last updated: 2026-06-10

How does leaving out trading costs inflate an Edge Score?

Costs come straight out of every trade, so a backtest that ignores them reports a bigger average win than you will ever collect. Commissions, spread, and slippage each shave a slice off your result. Drop them and your EV per trade reads high, which feeds two of the score's inputs at once: Edge Magnitude and Edge Velocity.

Quantprove scores the numbers in your file. It cannot tell whether you already subtracted costs, so a gross trade log scores higher than a net one for the same strategy. Export your trades with commissions, spread, and slippage already taken out, then upload. The score that comes back is one you can actually trade.

Your broker keeps the difference either way.
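To make the gap concrete, here is a minimal sketch that compares EV per trade before and after costs. The trade records and field names (`gross_pnl`, `commission`, `spread`, `slippage`) are made up for illustration and are not Quantprove's upload format.

```python
from statistics import mean

# Hypothetical trade records: gross P&L plus the costs paid on each trade.
trades = [
    {"gross_pnl": 120.0, "commission": 4.0, "spread": 6.0, "slippage": 5.0},
    {"gross_pnl": -80.0, "commission": 4.0, "spread": 6.0, "slippage": 3.0},
    {"gross_pnl": 45.0, "commission": 4.0, "spread": 6.0, "slippage": 2.0},
    {"gross_pnl": -30.0, "commission": 4.0, "spread": 6.0, "slippage": 4.0},
]


def net_pnl(trade):
    """Gross P&L minus commission, spread, and slippage for one trade."""
    return (
        trade["gross_pnl"]
        - trade["commission"]
        - trade["spread"]
        - trade["slippage"]
    )


gross_ev = mean(t["gross_pnl"] for t in trades)
net_ev = mean(net_pnl(t) for t in trades)

print(f"EV per trade, gross: {gross_ev:.2f}")  # 13.75
print(f"EV per trade, net:   {net_ev:.2f}")    # 0.25
```

In this toy log, costs take nearly all of the apparent edge. The net figure is the one to export.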

How does overfitting make a backtest look better than its real edge?

Overfitting is tuning a strategy until it fits the past in detail it will not repeat. Every extra parameter you tweak to lift the backtest buys a higher EV and a smoother equity curve in sample, which raises Edge Magnitude and Edge Consistency together. The gain is borrowed from the exact history you fit, so it does not carry forward.

The tell is a backtest that looks far better than anything you trade live. A score built on an overfit run is real for that history and hollow going forward. The check is out of sample data: hold back a slice of trades the tuning never saw, or run the strategy through Validation, where Quantprove scores it on trades outside the backtest.

The market never saw your tuning and will not honor it.
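One way to hold back a slice is a chronological split, sketched below. The function name `split_holdout`, the 30% default, and the sample numbers are assumptions for illustration. This is a local sanity check, not a description of how Validation works.

```python
from statistics import mean


def split_holdout(trade_pnls, holdout_fraction=0.3):
    """Split chronologically ordered trade P&Ls into (tuning, holdout).

    The holdout is always the most recent slice, so tuning never sees it.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    n_holdout = max(1, round(len(trade_pnls) * holdout_fraction))
    cut = len(trade_pnls) - n_holdout
    return trade_pnls[:cut], trade_pnls[cut:]


# Hypothetical P&Ls, oldest trade first.
pnls = [30, -10, 25, -5, 40, -15, 20, 10, -20, 5]
tuning, holdout = split_holdout(pnls)

print(f"Tuning EV:  {mean(tuning):.2f} over {len(tuning)} trades")
print(f"Holdout EV: {mean(holdout):.2f} over {len(holdout)} trades")
```

Tune only on the first slice. A holdout EV far below the tuning EV is the overfitting tell described above, though a holdout this small is too thin to settle anything on its own.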

Why does testing only in sample overstate your edge?

A backtest is in sample by definition: you built the strategy on the same data you are scoring. The Edge Score grades that history honestly, but history you fit on tends to flatter you, because you stopped adjusting once it looked good. The score reflects that stopping point, not a fresh test.

An Edge Score is a backtest grade, not proof the edge survives. It earns its weight when the same edge shows up on trades the strategy never saw. Keep a holdout set from the start, or move a strong Edge Score into Validation and read the Stability Score before you trust it.

How does cherry picking the date range inflate the score?

Markets move through regimes, and a strategy tuned to one of them shines inside that window and fades outside it. Start your backtest at the foot of a trend your system loves and the run looks clean: steady months, shallow drawdowns, high Edge Velocity. Slide the window and the same strategy can turn average.

A date range chosen to flatter the strategy inflates Edge Consistency most, because the monthly results look stable inside a regime that suited it. Test across several years and more than one market condition. If the score holds across calm and rough stretches, it means something.

If the score only holds in one window, the window is doing the work.
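A quick way to see whether one window is doing the work is to break EV per trade out by period. The sketch below groups by calendar year. The trade list and the helper name `ev_by_year` are illustrative assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical (exit_date, net_pnl) pairs spanning three years.
trades = [
    (date(2021, 3, 1), 50.0), (date(2021, 7, 9), 40.0), (date(2021, 11, 2), -10.0),
    (date(2022, 2, 14), -30.0), (date(2022, 6, 30), 10.0), (date(2022, 10, 5), -20.0),
    (date(2023, 1, 20), 20.0), (date(2023, 5, 8), -15.0), (date(2023, 9, 12), 5.0),
]


def ev_by_year(trades):
    """Return {year: EV per trade}, sorted by year."""
    buckets = defaultdict(list)
    for exit_date, pnl in trades:
        buckets[exit_date.year].append(pnl)
    return {year: mean(pnls) for year, pnls in sorted(buckets.items())}


for year, ev in ev_by_year(trades).items():
    print(f"{year}: EV per trade {ev:+.2f}")
```

If one year carries the result and the rest sit near zero or below, a backtest that starts in that year is reporting the regime, not the strategy.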

How does lookahead bias leak into a backtest?

Lookahead bias is using information a trade could not have known yet. Filling at the day's close when your signal needs that same close to fire, using a revised data point that arrived later, or sizing on a figure published after the entry all hand the backtest a peek at the future. Win rate and EV both climb on knowledge you would not have had live.

This one hides well, because the equity curve looks legitimate. Walk one suspect trade through real time: at the moment of entry, was every input already published? If the answer is no, the edge is partly built on hindsight, and the Edge Score is reading numbers you cannot reproduce.
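The real-time walk can be sketched as a timestamp check: list every input the trade used, note when each became available, and flag any that arrived after the entry. The input names and timestamps below are hypothetical.

```python
from datetime import datetime


def late_inputs(entry_time, inputs):
    """Return the names of inputs published after the entry time."""
    return [
        name
        for name, published_at in inputs.items()
        if published_at > entry_time
    ]


# Hypothetical trade entered one minute before the close.
entry_time = datetime(2024, 5, 6, 15, 59)
inputs = {
    "prior_day_close": datetime(2024, 5, 3, 16, 0),
    "same_day_close": datetime(2024, 5, 6, 16, 0),     # not printed yet at entry
    "revised_data_point": datetime(2024, 6, 7, 8, 30),  # revision arrived later
}

late = late_inputs(entry_time, inputs)
print("Inputs not yet published at entry:", late)
```

An empty list means the trade used only what was knowable at the time. Anything else is a lookahead leak to fix before the backtest is scored.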

Why does picking the best of many strategies inflate the result?

Run fifty variations and one will top the list by luck alone. Pick that winner and you have selected for a lucky run, not a real edge. The backtest of the best of fifty looks excellent, so its Edge Score does too, even when none of the fifty holds a true edge.

The more combinations you try, the higher the best score climbs on noise. Decide your rules before you test, and count how many variants you ran. If the strong score only appeared after a wide search, treat it as a candidate to validate, not a verdict. Out of sample trades and a Stability Score sort the real winner from the lucky one.

Your best backtest is the luckiest one you kept.
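A small simulation shows the effect. The sketch below draws trades for fifty variants that all have zero true edge, then reports the best backtest EV among them. The distribution, trade count, and function name `best_of_n` are assumptions chosen for illustration.

```python
import random
from statistics import mean


def best_of_n(n_variants=50, n_trades=200, seed=0):
    """Simulate zero-edge variants; return (best EV, average EV)."""
    rng = random.Random(seed)
    evs = [
        mean(rng.gauss(0.0, 1.0) for _ in range(n_trades))  # true EV is 0
        for _ in range(n_variants)
    ]
    return max(evs), mean(evs)


best_ev, average_ev = best_of_n()
print(f"Average EV across variants: {average_ev:+.3f}")
print(f"EV of the best variant:     {best_ev:+.3f}")
```

The average sits near zero, as it should, while the best variant shows a positive EV that comes from selection alone. Raising `n_variants` tends to push the best figure higher, which is why the number of variants tried belongs next to any score you report.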

Which Edge Score input does each mistake inflate?

Each mistake pushes on a specific part of the score, and each has a check that brings the number back to the real edge.

Mistake | Edge Score input it lifts | How to catch it
--- | --- | ---
Costs left out | EV per trade, Edge Velocity | Upload net trades, costs already removed
Overfitting | Edge Magnitude, Edge Consistency | Hold out a sample, validate out of sample
In sample only | The whole score | Keep a holdout set, read the Stability Score
Cherry picked window | Edge Consistency | Test across several years and regimes
Lookahead bias | Win rate, EV per trade | Replay a trade in real time, check inputs were published
Best of many | The whole score | Fix rules first, validate the winner

One inflation source the score guards against on its own is a small sample. A confidence multiplier scales the score down when the trade count is low and reaches full weight at 500 trades, so a thin backtest cannot post a full number. The six mistakes above live in the trade data itself, which the multiplier does not touch. See "How many trades you need to validate a strategy" for the sample size math.

An Edge Score is honest about the trades you give it. It cannot see the trade you stripped of costs, the parameter you tuned to the past, or the window you chose because it looked good. Fix those at the source and the score settles to the truth, which is the number worth having. One caveat: a backtest that passes every check here can still fail live, because clean inputs buy you a fair test, not a verdict. To go deeper on telling a real edge from a lucky streak, see "Real edge or lucky streak", then read your Edge Score on trades you can stand behind.

Frequently asked questions

A high Edge Score does not guarantee profit. It grades the statistical edge in a backtest, not your future returns. A strong score on a clean, well sized backtest is a green light to validate the strategy, not a guarantee. Profit depends on trading it live with the same costs and conditions the score was built on.
The Edge Score does account for a low trade count. This is the one inflation source the score guards against on its own. A confidence multiplier scales the score down when the trade count is low and reaches full weight at 500 trades, so a thin backtest cannot post a full score. The mistakes in this article live in the trade data itself, which the multiplier does not touch.
The Edge Score cannot catch overfitting from the backtest alone. It grades the history you upload, and an overfit run looks strong in that history. Catching it takes out of sample trades: run the strategy through Validation and read the Stability Score, which scores trades the backtest never saw.
Subtract commissions, spread, and slippage before you upload, so each trade reflects what you actually keep. Quantprove scores the numbers in your file and cannot tell gross from net, so a cost free log reports a higher EV per trade and a higher Edge Score than the strategy earns.
If fixing these mistakes lowers your score, trust the lower one. The drop is the inflation leaving. A score built on net trades, a full date range, and a held back sample is the honest read, and an honest read is the one you can plan around.

