What is Curve Fitting?
One of the greatest appeals and advantages of mechanical trading systems is the ability to evaluate their historical performance by “backtesting” the strategies on historical price data. While we may have just a handful of months of actual performance data available, computers and backadjusted data make it possible to see what a system “would have done” going back years and years.
The problem, of course, is that the system has been designed on this very same data. Because systems are built on past data, they often fall victim, intentionally or not, to what we call “curve fitting”, which makes the ability to backtest one of the biggest disadvantages of trading systems as well. The future may look nothing like the past in a particular market, so parameters “fitted” to the past “curve” of data can leave the system out of phase with future data and potentially cause investors losses.
The easiest way to understand “curve fitting” is through a simple example. Imagine a system that buys Soybean futures on a breakout above the market high of the past X days and sells on a breakout below the low of the past X days. A backtest on past data may show $5,000 in profits using a 10-day high/low, $10,000 using a 20-day high/low, and $20,000 using a 30-day high/low.
If you were the developer, which value would you use in designing the system: 10, 20, or 30? I would guess most people would use 30, as it gives the highest profit. A developer will of course look at more than just profit, testing for lowest drawdown or most winning months, for example; but whatever the goal, it is human nature to choose the parameters that produce results as close as possible to those desired. The problem is that a parameter which worked on past data is not guaranteed to work on future, unknown data. So how would a developer attempt to avoid such a problem?
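The parameter choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prices are a synthetic random walk rather than real Soybean data, the system is always in the market once it has a signal, breakouts are measured on closing prices, and the names `breakout_pnl` and `random_walk` are hypothetical helpers, not part of any trading library.

```python
import random

def breakout_pnl(prices, lookback):
    """Profit in price points of a simple stop-and-reverse breakout system.

    Go long when today's price exceeds the highest price of the previous
    `lookback` days, go short when it falls below the lowest.
    """
    position = 0   # +1 long, -1 short, 0 flat (before the first signal)
    pnl = 0.0
    for i in range(lookback, len(prices)):
        # Mark the position held overnight to today's price change.
        pnl += position * (prices[i] - prices[i - 1])
        window = prices[i - lookback:i]   # the previous `lookback` days
        if prices[i] > max(window):
            position = 1
        elif prices[i] < min(window):
            position = -1
    return pnl

def random_walk(n, seed):
    """Synthetic price series (an assumption standing in for market data)."""
    rng = random.Random(seed)
    prices = [1000.0]
    for _ in range(n - 1):
        prices.append(prices[-1] + rng.gauss(0, 10))
    return prices

prices = random_walk(2000, seed=1)
results = {x: breakout_pnl(prices, x) for x in (10, 20, 30)}
best = max(results, key=results.get)
print(results)
print("tempting choice of X:", best)
```

Because the series is a random walk, there is no real edge to find, yet one of the three lookbacks will still come out on top. Picking it because it scored best on this one history is exactly the temptation described above.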
Another example of curve fitting is adjusting parameters after the fact. Imagine a trading model which is doing well overall, but has suffered a rather big loss on three out of four Wednesday afternoons since live trading began. A trader might look at those results and come up with a brilliant plan: code the system not to take any trades on Wednesdays after 1:30pm. Rerunning the backtest with the new logic in place would make those losing Wednesday trades go away, and voila, you have curve fitting.
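To make the Wednesday example concrete, here is a minimal sketch. The trade log is invented purely for illustration (the dates, times, and dollar amounts are assumptions), and `wednesday_filter` is a hypothetical name for the after-the-fact rule.

```python
from datetime import datetime, time

# Hypothetical live trade log: (entry time, profit or loss in dollars).
trades = [
    (datetime(2024, 1, 2, 10, 0), 400.0),     # Tuesday
    (datetime(2024, 1, 3, 14, 0), -900.0),    # Wednesday afternoon, loss
    (datetime(2024, 1, 5, 11, 0), 300.0),     # Friday
    (datetime(2024, 1, 10, 14, 30), -1100.0), # Wednesday afternoon, loss
    (datetime(2024, 1, 11, 9, 45), 250.0),    # Thursday
    (datetime(2024, 1, 17, 15, 0), 200.0),    # Wednesday afternoon, win
    (datetime(2024, 1, 24, 13, 45), -800.0),  # Wednesday afternoon, loss
    (datetime(2024, 1, 26, 10, 30), 350.0),   # Friday
]

def wednesday_filter(entry):
    """The after-the-fact rule: allow a trade unless it is on a
    Wednesday at or after 1:30pm (Monday is weekday 0, Wednesday is 2)."""
    return not (entry.weekday() == 2 and entry.time() >= time(13, 30))

original_total = sum(pnl for _, pnl in trades)
filtered_total = sum(pnl for entry, pnl in trades if wednesday_filter(entry))

print("as traded:        ", original_total)
print("with the new rule:", filtered_total)
```

On this made-up log the total moves from a $1,300 loss to a $1,300 gain, but only because the rule was written after seeing which trades lost. Nothing here shows that Wednesday afternoons will behave the same way next month.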
To combat curve fitting, developers use many tricks of the trade, such as testing on out-of-sample data (data the system was not designed on), not optimizing parameters for the best backtested results (instead using logic-based parameters, such as a 5-day moving average to align with the trading week), and keeping the number of parameters in the system as small as possible (more parameters means more degrees of freedom, and more things to go wrong).
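The out-of-sample idea can be sketched as follows. This is one simple way to do it, not a standard recipe: the 70/30 split, the moving-average rule, the candidate lengths, and the function names `ma_system_pnl` and `split_test` are all assumptions chosen for illustration, and the prices are again a synthetic random walk.

```python
import random

def ma_system_pnl(prices, length):
    """Points earned by holding long when price is above its `length`-day
    simple moving average and short otherwise."""
    pnl = 0.0
    for i in range(length, len(prices) - 1):
        average = sum(prices[i - length + 1:i + 1]) / length
        position = 1 if prices[i] > average else -1
        # The position decided on day i earns the move from day i to i + 1.
        pnl += position * (prices[i + 1] - prices[i])
    return pnl

def split_test(prices, candidates, in_sample_fraction=0.7):
    """Pick the best length on the first part of the data only, then
    report how that same length did on the untouched remainder."""
    cut = int(len(prices) * in_sample_fraction)
    in_sample, out_of_sample = prices[:cut], prices[cut:]
    best = max(candidates, key=lambda n: ma_system_pnl(in_sample, n))
    return (best,
            ma_system_pnl(in_sample, best),
            ma_system_pnl(out_of_sample, best))

# Synthetic prices standing in for real market data.
rng = random.Random(7)
prices = [1000.0]
for _ in range(1499):
    prices.append(prices[-1] + rng.gauss(0, 10))

best, in_pnl, out_pnl = split_test(prices, candidates=range(5, 55, 5))
print(f"best length in sample: {best}")
print(f"in-sample points: {in_pnl:.0f}, out-of-sample points: {out_pnl:.0f}")
```

The in-sample figure is flattering by construction, since the length was selected to maximize it. The out-of-sample figure is the more honest estimate, and a large gap between the two is a warning sign of curve fitting.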
Next week, we’ll delve into how an investor can make sure they are trading a “Curve-fit Free” Trading System.