Trading Strategy Backtesting

Overview

Backtesting is the process of applying a trading strategy to historical market data to evaluate how it would have performed. It is the primary validation tool available to systematic traders before committing real capital to a strategy — the mechanism that separates strategies with a genuine statistical edge from strategies that look compelling in a chart pattern or a theoretical framework but do not hold up when tested against historical price action.

Done well, backtesting produces quantitative evidence about a strategy's historical performance: its return profile, its drawdown characteristics, its risk-adjusted performance metrics, its behaviour across different market regimes, and the sensitivity of its results to the specific parameter choices the strategy uses. This evidence does not guarantee future performance, but it provides a principled basis for assessing whether a strategy has the characteristics — consistent edge, manageable drawdown, acceptable risk-adjusted return — that justify allocating capital to it.

Done poorly, backtesting produces results that look better than the strategy deserves. Overfitting to historical data produces strategies that describe the past precisely but have no predictive value for the future. Failing to account for transaction costs produces optimistic returns that disappear when realistic execution costs are applied. Ignoring market impact, slippage, and liquidity produces fills at prices that would not have been achievable in live trading. Lookahead bias — using data that would not have been available at the time of the trading decision — invalidates the entire result.

Custom backtesting frameworks are built to produce reliable results — implementing the simulation methodology, the transaction cost modelling, the data handling, and the statistical analysis that rigorous strategy validation requires, rather than the simplified approaches that generate attractive but misleading results.

We build custom backtesting frameworks for systematic traders, quantitative research teams, and algorithmic trading operations that need backtesting infrastructure specific to the instruments they trade, the data sources they use, the execution model that reflects their live trading reality, and the analytical framework that their strategy research process requires.


What Backtesting Frameworks Cover

Event-driven simulation engine. The simulation architecture that processes historical market data event by event — tick by tick or bar by bar — applying the strategy logic to each data point as if it were arriving in real time. Event-driven simulation is the correct architecture for backtesting strategies that interact with intraday price movements, that place orders at specific price levels, or that manage positions based on intraday events rather than end-of-day closes.

The event-driven engine processes market data events — price updates, volume data, order book snapshots — in chronological sequence, presenting each event to the strategy's signal logic in the order it would have been received in live trading. Strategy decisions made on each event pass through the execution model, which converts them into simulated fills using fill logic that reflects the execution reality of the market and venue being modelled.

For bar-based strategies — strategies that make decisions at the close of each bar rather than responding to intraday price movements — bar-by-bar simulation with realistic bar data provides a computationally efficient backtesting approach that handles large data sets and long historical periods with manageable processing time.
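The event-driven loop described above can be sketched in a few lines. This is a deliberately minimal illustration, not a real engine: the `Event`, `Fill`, and strategy-callback shapes are assumptions for the example, and the fill model is naive (fills at the event price, with no spread or slippage).

```python
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: int   # epoch seconds; real engines use finer-grained tick times
    symbol: str
    price: float

@dataclass
class Fill:
    symbol: str
    qty: float
    price: float

def run_backtest(events, strategy):
    """Feed events in strict chronological order; collect simulated fills.

    `strategy` is a callable mapping an Event to an order quantity
    (+buy / -sell / 0). Sorting by timestamp enforces the "as it would
    have arrived in live trading" ordering the engine depends on.
    """
    fills = []
    for event in sorted(events, key=lambda e: e.timestamp):
        order_qty = strategy(event)
        if order_qty:
            # Naive fill model for illustration: fill at the event price.
            fills.append(Fill(event.symbol, order_qty, event.price))
    return fills

# Usage: buy 1 unit whenever price drops below 100. Note the events are
# supplied out of order; the engine still processes timestamp 1 first.
events = [Event(2, "EURUSD", 101.0), Event(1, "EURUSD", 99.5)]
fills = run_backtest(events, lambda e: 1 if e.price < 100 else 0)
```

A production engine adds order lifecycle state, latency modelling, and multi-stream event merging, but the invariant is the same: the strategy only ever sees events in arrival order.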

Data management and quality. Backtesting results are only as reliable as the data they are built on. Data management in the backtesting framework handles the historical data sources that the simulation depends on — acquiring, storing, validating, and serving the price data, volume data, and derived data that the strategy uses.

Tick data for high-frequency and intraday strategies — the full price and volume history at the finest granularity available for the instruments being tested. OHLCV bar data for daily and swing strategies — adjusted for corporate actions (dividends, splits) so that the price series reflects the actual returns an investor would have experienced rather than the unadjusted price history that creates fictitious gaps at corporate action dates.

Bid-ask spread data where available — the spread that determines the realistic cost of market orders and the fill price of limit orders near the touch. Without spread data, the backtesting framework applies spread assumptions based on historical average spread characteristics for each instrument and market condition.

Data quality validation — identifying and handling the data errors (missing bars, erroneous prices, duplicate records) that are present in most historical data sources — prevents the simulation artefacts that unvalidated data produces: phantom trades triggered by erroneous price spikes, position errors from missing bars, misleading performance metrics from corrupted data.

Transaction cost modelling. Transaction costs are the difference between theoretical backtest returns and realised live trading returns. A strategy that generates 15% annual return before costs and has 2% annual transaction costs has a 13% live return — the 2% cost drag is the price of execution. A strategy that generates 8% before costs with 2% transaction costs is barely worth running. Accurate transaction cost modelling is not a detail — it is a fundamental input that determines whether a strategy's edge survives implementation.

Transaction cost modelling in the backtesting framework accounts for the full range of costs that live trading incurs: broker commissions applied per trade at the rate that the live trading account pays, bid-ask spread cost applied at the estimated spread for the instrument and market condition at the time of the simulated trade, slippage applied as a function of the simulated order size relative to the available liquidity, and market impact for strategies where position sizes are significant relative to typical trading volumes.

Cost sensitivity analysis — running the backtest at multiple cost assumptions to identify the cost level at which the strategy's edge disappears — produces the break-even cost estimate that tells the trader how much execution quality they need to maintain for the strategy to remain viable.

Execution simulation. The simulated execution model determines how the strategy's trading decisions translate into fills. Realistic execution simulation matters because the difference between what a strategy tries to do and what the market allows it to do is a significant source of the gap between backtest and live performance.

Market order execution simulation applies the estimated bid-ask spread and slippage to market orders — the fill is not at the mid-price but at the ask for buys and the bid for sells, with additional slippage for orders large enough to move the market. Limit order execution simulation models the probability of limit order fill based on the relationship between the limit price and the market price history — a limit order at a price that the market only briefly touched has a lower fill probability than one at a price the market spent significant time at. Stop order simulation accounts for the gap risk that stop orders face in illiquid conditions and the slippage that stop orders typically experience when triggered in fast-moving markets.

For venue-specific execution — MetaTrader forex brokers, cryptocurrency exchanges with specific maker-taker fee structures, futures exchanges with specific tick sizes and margin requirements — the execution simulation is configured to reflect the specific execution model of the venue the strategy will trade on in live trading.

Position and portfolio tracking. The backtesting framework maintains the simulated portfolio state throughout the simulation — the positions open at each point in time, the cash available after margin requirements are satisfied, the P&L on each position, and the portfolio-level P&L and metrics that the strategy evaluation depends on.

Position tracking handles the full range of position events that strategies produce: entries, exits, partial entries and exits, position scaling, stop loss and take profit execution, and the position expiry events that futures and options strategies generate. Multi-instrument portfolio tracking — the aggregate portfolio state across all instruments that a multi-instrument strategy trades simultaneously — provides the portfolio-level analysis that single-instrument backtesting cannot capture.

Performance analytics. The backtest output that the strategy evaluation depends on — the performance metrics that characterise the strategy's historical behaviour and enable comparison between strategy variants.

  • Return metrics: total return, annualised return, monthly return distribution, calendar year returns.
  • Risk metrics: maximum drawdown, average drawdown, drawdown duration, volatility, downside deviation.
  • Risk-adjusted metrics: Sharpe ratio, Sortino ratio, Calmar ratio, information ratio against a benchmark.
  • Trade statistics: win rate, average win, average loss, profit factor, average holding period, number of trades.
  • Regime analysis: performance by market regime (trending, ranging, high volatility, low volatility) to understand the market conditions under which the strategy performs and underperforms.

Equity curve analysis — the visual and statistical analysis of the portfolio value over time — surfaces the drawdown periods, the recovery periods, and the consistency of returns that the summary statistics may obscure.

Walk-forward testing. In-sample optimisation — fitting strategy parameters to a historical data window — is the source of the overfitting that makes backtested results better than live results. Walk-forward testing is the methodology that reduces overfitting by repeatedly optimising on a training window and testing on the subsequent out-of-sample window, repeating this process across the full historical period to produce an out-of-sample performance estimate that is less contaminated by overfitting than a single in-sample optimisation.

Walk-forward testing infrastructure automates the walk-forward process — the repeated cycles of in-sample optimisation and out-of-sample testing across the full historical period — and assembles the out-of-sample results into the performance record that provides the most reliable estimate of the strategy's live performance potential.

Monte Carlo simulation. Historical backtesting produces a single path through time — the path that actually happened. Monte Carlo simulation produces a distribution of possible outcomes by randomly sampling from the strategy's historical trade results to generate thousands of alternative performance paths. The Monte Carlo distribution shows the range of drawdowns the strategy could experience, the range of returns it could produce, and the probability of the worst-case outcomes that the single historical path may not have exhibited.

Monte Carlo analysis provides the probabilistic risk assessment that complements the deterministic historical backtest — the answer to the question "what is the worst outcome I should plan for?" rather than "what happened historically?"

Parameter sensitivity analysis. Most trading strategies have parameters — the lookback period for a moving average, the multiplier for an ATR-based stop, the threshold for a momentum signal. Optimising these parameters on the historical data produces the values that performed best historically. Sensitivity analysis examines how the strategy's performance varies across a range of parameter values, identifying whether the optimal parameters are robust — the strategy performs well across a range of parameter values near the optimum — or fragile — the strategy only performs well at a narrow set of parameter values that may not generalise to future conditions.


Integration With Live Trading Infrastructure

Backtesting frameworks that are architecturally consistent with the live trading infrastructure reduce the implementation risk of transitioning a strategy from backtest to live trading. When the same signal logic runs in backtest and in live execution, the performance gap attributable to signal implementation differences is minimised.

Signal library consistency. Signal logic developed and validated in the backtesting framework is deployed directly to the live execution system — the same code that generates signals in backtesting generates signals in live trading, eliminating the re-implementation differences that create discrepancies between backtest and live performance.

Data pipeline integration. The same data sources and data processing pipelines that feed the backtesting simulation feed the live execution system — ensuring that the data the live system uses is consistent with the data the backtest was validated on.

Paper trading bridge. Paper trading — executing the strategy in a simulated environment using live market data — bridges the gap between backtesting on historical data and live trading with real capital. The backtesting framework's execution simulation runs against live data in paper trading mode, providing the most realistic pre-live validation available before capital is committed.


Technologies Used

  • Rust — high-performance event-driven simulation engine, tick data processing, large-scale parameter optimisation, Monte Carlo simulation
  • Python — strategy research and signal development, statistical analysis, performance analytics, data acquisition and processing, walk-forward testing automation
  • C# / ASP.NET Core — backtesting service API, live trading infrastructure integration, historical data management
  • React / Next.js — backtesting interface, performance analytics dashboard, equity curve visualisation, parameter sensitivity views
  • TypeScript — type-safe frontend and API code throughout
  • SQL (PostgreSQL, MySQL, SQLite) — historical price data storage, backtest results, trade records, parameter sets
  • Redis — backtesting job queuing, result caching, real-time simulation state
  • Parquet / columnar storage — efficient storage and retrieval of large tick data and OHLCV datasets
  • NumPy / Pandas — numerical computation and time-series data processing in Python strategy research
  • REST / Webhooks — data vendor API integration, live trading system connectivity

The Gap Between Backtest and Live Performance

The performance gap between backtesting results and live trading results is real, predictable in its sources, and manageable with the right simulation methodology. Overfitting inflates backtest returns by finding parameters that describe the past rather than predicting the future. Unrealistic execution assumptions produce fills that would not have been achieved in live trading. Lookahead bias uses information that was not available at decision time. Survivorship bias tests only the instruments that survived to the end of the historical period, ignoring the instruments that were delisted or defaulted.

Custom backtesting frameworks address each of these sources of backtest inflation — implementing walk-forward testing to manage overfitting, realistic execution simulation to manage execution assumptions, careful data handling to prevent lookahead bias, and full instrument universe data that includes delisted instruments to prevent survivorship bias. The result is backtesting results that are less flattering than naive approaches produce but more honest about the strategy's actual potential.


Test Thoroughly, Trade Confidently

The purpose of backtesting is not to produce attractive performance statistics — it is to produce reliable evidence about a strategy's behaviour. Reliable evidence supports confident capital allocation. Unreliable evidence, however attractive, leads to capital deployed in strategies that were never as good as the backtest suggested.