What You Are Building
A Python backtesting workflow that takes any trading strategy generated by Claude Code, tests it against historical price data, and produces real performance metrics: total return, Sharpe ratio, max drawdown, and trade-by-trade logs. You will use vectorbt for fast backtests and Claude Code to generate the strategy logic.
Why Backtesting Matters
Writing a trading bot is the easy part. Knowing whether it would have made or lost money in the past is what separates a toy project from a real strategy. Every bot tutorial on this site — DCA bots, momentum strategies, news-reactive bots — produces a strategy you can test on paper. But paper trading only shows you forward performance. Backtesting lets you see how the strategy would have handled the 2022 crypto crash, the 2023 rally, or the March 2026 liquidation cascade.
The catch: backtesting is easy to do wrong. Bad data, lookahead bias, and overfitting to historical patterns can make a losing strategy look profitable. This tutorial covers how to avoid those mistakes.
Prerequisites
- Python 3.10+
- Claude Code installed and working
- Basic Python familiarity
- No prior backtesting experience needed
Step 1: Set Up the Environment
Create a project and install dependencies:
mkdir backtest-lab && cd backtest-lab
python -m venv venv
source venv/bin/activate
pip install vectorbt yfinance pandas numpy matplotlib kaleido
vectorbt is a Python library built for fast backtesting. It runs on NumPy arrays with Numba-compiled routines, so it can test thousands of parameter combinations in seconds. yfinance provides free historical price data from Yahoo Finance. kaleido is needed later to export the Plotly equity curve as a PNG.
Step 2: Fetch Historical Data
Start Claude Code and prompt it to build the data pipeline:
Write a Python script that:
- Uses yfinance to download daily OHLCV data for a given ticker symbol
- Accepts the ticker, start date, and end date as command-line arguments
- Saves the data to a CSV file in a /data directory
- Prints the date range, number of rows, and any missing days
- Handles stock splits and dividend adjustments (use adjusted close)
Claude Code will produce something like this:
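import yfinance as yf
import pandas as pd
import sys
import os
def fetch_data(ticker, start, end):
    os.makedirs("data", exist_ok=True)
    # auto_adjust=True returns prices already adjusted for splits and dividends
    df = yf.download(ticker, start=start, end=end, auto_adjust=True)
    if df.empty:
        print(f"No data returned for {ticker}")
        return None
    # Recent yfinance releases return (field, ticker) MultiIndex columns even
    # for a single ticker; flatten them so the CSV has one plain header row
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = df.columns.get_level_values(0)
    output_path = f"data/{ticker}_{start}_{end}.csv"
    df.to_csv(output_path)
    # bdate_range counts every weekday, including market holidays,
    # so for stocks this number overstates the real gaps
    trading_days = pd.bdate_range(start, end)
    missing = trading_days.difference(df.index)
    print(f"Ticker: {ticker}")
    print(f"Date range: {df.index[0].date()} to {df.index[-1].date()}")
    print(f"Total rows: {len(df)}")
    print(f"Missing trading days: {len(missing)}")
    return df
if __name__ == "__main__":
    ticker = sys.argv[1] if len(sys.argv) > 1 else "SPY"
    start = sys.argv[2] if len(sys.argv) > 2 else "2020-01-01"
    end = sys.argv[3] if len(sys.argv) > 3 else "2026-05-01"
    fetch_data(ticker, start, end)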
Run it:
python fetch_data.py SPY 2020-01-01 2026-05-01
python fetch_data.py BTC-USD 2020-01-01 2026-05-01
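Before backtesting, it is worth checking the downloaded file for bad rows, since bad data is one of the mistakes named earlier. The following is a minimal sketch, assuming the CSV has a date index and a Close column; `check_prices` is a hypothetical helper name, not part of yfinance or vectorbt, and the in-memory frame stands in for a real file:

```python
import pandas as pd


def check_prices(df: pd.DataFrame) -> dict:
    """Return basic data-quality counts for a daily OHLCV frame."""
    close = df["Close"]
    return {
        "rows": len(df),
        "nan_close": int(close.isna().sum()),
        "duplicate_dates": int(df.index.duplicated().sum()),
        "non_positive_close": int((close <= 0).sum()),
        "sorted": bool(df.index.is_monotonic_increasing),
    }


# Tiny example standing in for
# pd.read_csv("data/<ticker>_<start>_<end>.csv", index_col=0, parse_dates=True)
sample = pd.DataFrame(
    {"Close": [100.0, 101.5, None, 99.0]},
    index=pd.to_datetime(
        ["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-04"]
    ),
)
report = check_prices(sample)
print(report)
```

Any non-zero count for missing closes or duplicate dates is a reason to inspect the file before trusting a backtest built on it.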
Step 3: Build and Backtest an SMA Crossover Strategy
Now prompt Claude Code to create the backtest:
Write a Python backtesting script using vectorbt that:
- Loads OHLCV data from a CSV file
- Implements a simple moving average crossover strategy (fast SMA crosses above slow SMA = buy, crosses below = sell)
- Accepts fast_period and slow_period as parameters (default 20 and 50)
- Runs the backtest with an initial capital of $10,000
- Prints: total return %, annualized return %, Sharpe ratio, max drawdown %, total trades, win rate %
- Generates an equity curve chart saved as PNG
- Exports a trade log to CSV with entry date, exit date, entry price, exit price, PnL, and holding period
The core backtest code with vectorbt:
import vectorbt as vbt
import pandas as pd
import sys
def run_backtest(csv_path, fast_period=20, slow_period=50, initial_cash=10000):
    df = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    close = df["Close"]
    # Fast and slow simple moving averages of the closing price
    fast_ma = vbt.MA.run(close, window=fast_period)
    slow_ma = vbt.MA.run(close, window=slow_period)
    # Buy when the fast SMA crosses above the slow SMA, sell on the reverse cross
    entries = fast_ma.ma_crossed_above(slow_ma)
    exits = fast_ma.ma_crossed_below(slow_ma)
    portfolio = vbt.Portfolio.from_signals(
        close,
        entries=entries,
        exits=exits,
        init_cash=initial_cash,
        fees=0.001,  # 0.1% per trade
        freq="1D",
    )
    stats = portfolio.stats()
    # Annualized return is not among the default stats() rows, so compute it directly
    annualized_pct = portfolio.annualized_return() * 100
    print("\n=== Backtest Results ===")
    print(f"Total Return:      {stats['Total Return [%]']:.2f}%")
    print(f"Annualized Return: {annualized_pct:.2f}%")
    print(f"Sharpe Ratio:      {stats['Sharpe Ratio']:.2f}")
    print(f"Max Drawdown:      {stats['Max Drawdown [%]']:.2f}%")
    print(f"Total Trades:      {stats['Total Trades']}")
    print(f"Win Rate:          {stats['Win Rate [%]']:.2f}%")
    # Save equity curve (write_image needs the kaleido package installed)
    fig = portfolio.plot()
    fig.write_image("equity_curve.png")
    print("\nEquity curve saved to equity_curve.png")
    # Export trade log
    trades = portfolio.trades.records_readable
    trades.to_csv("trade_log.csv", index=False)
    print(f"Trade log saved to trade_log.csv ({len(trades)} trades)")
    return portfolio
if __name__ == "__main__":
    csv_path = sys.argv[1] if len(sys.argv) > 1 else "data/SPY_2020-01-01_2026-05-01.csv"
    fast = int(sys.argv[2]) if len(sys.argv) > 2 else 20
    slow = int(sys.argv[3]) if len(sys.argv) > 3 else 50
    run_backtest(csv_path, fast, slow)
Run it:
python backtest.py data/SPY_2020-01-01_2026-05-01.csv 20 50
You will get output like this (the figures are illustrative and will vary with your data):
=== Backtest Results ===
Total Return:      34.82%
Annualized Return: 5.12%
Sharpe Ratio:      0.41
Max Drawdown:      18.73%
Total Trades:      23
Win Rate:          43.48%
A Sharpe ratio below 1.0 and a 43% win rate are typical for a basic SMA crossover on SPY. The strategy works because its winners are larger than its losers, not because it is right most of the time.
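The point about winners outweighing losers can be made concrete with per-trade expectancy. This is a small sketch with hypothetical dollar figures (the 43% win rate echoes the example output, while the average win and loss are made up for illustration):

```python
def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Average profit per trade; avg_loss is passed as a positive number."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# Hypothetical: 43% winners, average win $300, average loss $150
edge = expectancy(0.43, 300.0, 150.0)

# Same win rate, but winners no larger than losers
no_edge = expectancy(0.43, 150.0, 150.0)

print(f"Winners 2x losers: {edge:+.2f} per trade")
print(f"Winners = losers:  {no_edge:+.2f} per trade")
```

With the same win rate, the sign of the result flips depending only on the ratio of average win to average loss.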
Step 4: Parameter Optimization
Test many SMA combinations at once to find which periods work best:
Add parameter optimization to the backtest:
- Test fast periods from 5 to 50 (step 5) and slow periods from 20 to 200 (step 10)
- Only test combinations where fast < slow
- For each combination, record: total return, Sharpe ratio, max drawdown, number of trades
- Print the top 10 combinations sorted by Sharpe ratio
- Generate a heatmap of Sharpe ratios across the parameter grid, saved as PNG
import itertools
import pandas as pd
import vectorbt as vbt
def optimize(csv_path, initial_cash=10000):
    df = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    close = df["Close"]
    fast_periods = range(5, 55, 5)     # 5, 10, ..., 50
    slow_periods = range(20, 210, 10)  # 20, 30, ..., 200
    results = []
    for fast, slow in itertools.product(fast_periods, slow_periods):
        # Skip combinations where the fast window is not shorter than the slow one
        if fast >= slow:
            continue
        fast_ma = vbt.MA.run(close, window=fast)
        slow_ma = vbt.MA.run(close, window=slow)
        entries = fast_ma.ma_crossed_above(slow_ma)
        exits = fast_ma.ma_crossed_below(slow_ma)
        pf = vbt.Portfolio.from_signals(
            close, entries=entries, exits=exits,
            init_cash=initial_cash, fees=0.001, freq="1D"
        )
        stats = pf.stats()
        results.append({
            "fast": fast,
            "slow": slow,
            "return_pct": stats["Total Return [%]"],
            "sharpe": stats["Sharpe Ratio"],
            "max_dd": stats["Max Drawdown [%]"],
            "trades": stats["Total Trades"],
        })
    results_df = pd.DataFrame(results)
    top10 = results_df.nlargest(10, "sharpe")
    print("\n=== Top 10 Parameter Combinations (by Sharpe Ratio) ===")
    print(top10.to_string(index=False))
    return results_df
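The prompt also asks for a heatmap, which the function above does not draw. One possible way to produce it from the returned DataFrame is sketched below, assuming the same `fast`, `slow`, and `sharpe` column names; `sharpe_grid`, `save_heatmap`, and the output filename are hypothetical, and the demo rows are placeholders rather than real results:

```python
import pandas as pd


def sharpe_grid(results_df: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-form results into a fast x slow grid of Sharpe ratios."""
    return results_df.pivot(index="fast", columns="slow", values="sharpe")


def save_heatmap(grid: pd.DataFrame, path: str = "sharpe_heatmap.png") -> None:
    """Draw the grid with matplotlib; skipped combinations stay blank (NaN)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 5))
    im = ax.imshow(grid.values, aspect="auto", origin="lower", cmap="viridis")
    ax.set_xticks(range(len(grid.columns)))
    ax.set_xticklabels(grid.columns)
    ax.set_yticks(range(len(grid.index)))
    ax.set_yticklabels(grid.index)
    ax.set_xlabel("Slow SMA period")
    ax.set_ylabel("Fast SMA period")
    fig.colorbar(im, ax=ax, label="Sharpe ratio")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)


# Placeholder rows in the same shape optimize() returns
demo = pd.DataFrame({
    "fast": [5, 5, 10],
    "slow": [20, 30, 20],
    "sharpe": [0.4, 0.6, 0.5],
})
grid = sharpe_grid(demo)
print(grid)
```

In practice you would call `save_heatmap(sharpe_grid(optimize(csv_path)))`. A broad plateau of similar Sharpe ratios is more reassuring than a single bright cell surrounded by poor neighbors.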
Step 5: Backtest a Strategy from Another Tutorial
The real value is backtesting strategies you actually plan to trade. If you built a momentum bot or a DCA bot, you can test that exact logic against historical data.
Prompt Claude Code:
Convert the momentum strategy from my momentum_bot.py into a vectorbt backtest. The strategy rules are:
- Buy when 14-day RSI crosses above 30 from below AND price is above the 50-day SMA
- Sell when RSI crosses above 70 OR price drops below the 50-day SMA
- Use the same position sizing and fee assumptions
- Run it on SPY daily data from 2020 to 2026
Claude Code will translate the live trading logic into vectorbt signals. The key difference: in live trading you deal with partial candles, API latency, and slippage, while in a backtest every order fills instantly at the close price. That gap usually makes backtest results look better than live results, so keep it in mind when you compare the two.
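To show what the translated rules might look like, here is a sketch of the entry and exit signals in plain pandas. It assumes an RSI built from simple rolling averages of gains and losses (the bot's own RSI variant may differ), and `rsi` and `momentum_signals` are hypothetical names; the price series is synthetic:

```python
import numpy as np
import pandas as pd


def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """RSI from simple rolling averages of gains and losses."""
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    rs = gain / loss
    return 100 - 100 / (1 + rs)


def momentum_signals(close: pd.Series, rsi_window: int = 14, sma_window: int = 50):
    """Boolean entry/exit Series for the RSI + SMA rules described above."""
    r = rsi(close, rsi_window)
    sma = close.rolling(sma_window).mean()
    # A cross above a level: above it now, at or below it on the previous bar
    crossed_up_30 = (r > 30) & (r.shift(1) <= 30)
    crossed_up_70 = (r > 70) & (r.shift(1) <= 70)
    entries = crossed_up_30 & (close > sma)
    exits = crossed_up_70 | (close < sma)
    return entries, exits


# Synthetic random-walk prices, only to exercise the functions
rng = np.random.default_rng(0)
close = pd.Series(
    100 + rng.normal(0, 1, 300).cumsum(),
    index=pd.date_range("2022-01-03", periods=300, freq="B"),
)
entries, exits = momentum_signals(close)
print(f"{int(entries.sum())} entry signals, {int(exits.sum())} exit signals")
```

The two boolean Series have the same index as the prices, so they can be passed as `entries` and `exits` to `vbt.Portfolio.from_signals` in the same way as the SMA crossover signals.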
Common Backtesting Mistakes
These are the errors that make bad strategies look good:
Lookahead Bias
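Your strategy accidentally uses future data to make decisions. This happens when a value used at one bar depends on data from later bars: for example a centered rolling window, features normalized with full-sample statistics, a signal computed from today's close that is also filled at today's close, or parameters tuned on the full dataset before it is split into in-sample and out-of-sample periods. Always calculate indicators and signals using only data available up to the decision point.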
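A common guard is to lag the signal by one bar, so the position held during a bar is decided from the previous bar's data. The toy series below is made up purely to show the mechanics:

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0, 14.0])
sma3 = close.rolling(3).mean()

# Signal computed from bar t's close: only known once bar t has finished
signal = close > sma3

# Lag by one bar so the position held during bar t uses bar t-1's signal
position = signal.shift(1, fill_value=False)

bar_returns = close.pct_change().fillna(0.0)
biased = (bar_returns * signal).sum()    # earns bar t's return using bar t's close
honest = (bar_returns * position).sum()  # acts on the following bar

print(f"Same-bar (biased) return sum: {biased:.4f}")
print(f"Lagged (honest) return sum:   {honest:.4f}")
```

On this series the same-bar version looks far better than the lagged one, which is the typical signature of lookahead bias.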
Survivorship Bias
If you test a stock screening strategy using today’s S&P 500 list, you are only testing companies that survived. Companies that went bankrupt or were delisted are not in your test data, and they would have been sell signals or losses. For index-level backtests (SPY, QQQ), this is less of an issue because you are testing the index itself.
Overfitting
If your optimized parameters work perfectly on 2020-2023 data but fail on 2024-2026 data, you have overfit to the training period. Split your data: use 2020-2023 for optimization and 2024-2026 for validation. As a rule of thumb, if the Sharpe ratio drops by more than 50% on the validation set, the strategy is probably curve-fitted.
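The 50% check can be expressed in a few lines. This sketch assumes daily returns, a zero risk-free rate, and 252 trading days per year (vectorbt's own annualization settings may differ); the function names and both return lists are hypothetical:

```python
import math
import statistics


def sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)
    return mean / stdev * math.sqrt(periods_per_year)


def sharpe_degradation(train_returns, test_returns):
    """Fraction of in-sample Sharpe lost out of sample (0.6 means a 60% drop)."""
    s_train = sharpe(train_returns)
    s_test = sharpe(test_returns)
    return (s_train - s_test) / s_train


# Made-up daily returns for the optimization and validation periods
train = [0.010, -0.005, 0.008, 0.002, -0.003, 0.007]
test = [0.004, -0.006, 0.003, -0.004, 0.005, -0.001]

drop = sharpe_degradation(train, test)
print(f"Sharpe degradation: {drop:.0%}")
print("Probably curve-fitted" if drop > 0.5 else "Holds up out of sample")
```

The ratio is only meaningful when the in-sample Sharpe is positive, so a real implementation would handle that case before dividing.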
Ignoring Trading Costs
A strategy that trades 500 times per year at 0.1% per trade pays fees worth about 50% of its capital each year (roughly 39% once compounding is accounted for) before any market returns. Always include realistic fee assumptions. For crypto, 0.1% is a common exchange fee. For stocks through a broker like Alpaca, commissions are zero but you still pay spread costs of roughly 0.01-0.05%.
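The fee arithmetic is easy to check. This sketch assumes every trade pays the fee on the whole account balance and ignores market returns; `capital_after_fees` is a hypothetical helper:

```python
def capital_after_fees(trades: int, fee_rate: float) -> float:
    """Fraction of capital left if every trade pays fee_rate on the full balance."""
    return (1.0 - fee_rate) ** trades


trades_per_year = 500
fee_rate = 0.001  # 0.1% per trade

simple_sum = trades_per_year * fee_rate
remaining = capital_after_fees(trades_per_year, fee_rate)

print(f"Fees as a simple sum: {simple_sum:.1%} of capital")
print(f"Capital left after compounding: {remaining:.1%}")
print(f"Compounded loss to fees: {1 - remaining:.1%}")
```

The simple sum is 50%, while the compounded loss comes out somewhat lower because each fee is charged on a shrinking balance. Either way, the strategy needs a very large gross return just to break even.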
Unrealistic Fill Assumptions
Backtests assume you can buy or sell at the exact close price. In reality, your limit order might not fill, or you might get a worse price during volatile moments. Adding 0.05-0.1% slippage per trade (for example through the slippage argument of vbt.Portfolio.from_signals) makes results more realistic.
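The effect of slippage on a single round trip can be sketched as follows. The helper names and prices are hypothetical, and the model simply worsens each fill by a fixed percentage, which is a simplification of real market impact:

```python
def fill_price(close: float, side: str, slippage: float = 0.0005) -> float:
    """Worsen the fill: pay more on buys, receive less on sells."""
    if side == "buy":
        return close * (1.0 + slippage)
    if side == "sell":
        return close * (1.0 - slippage)
    raise ValueError(f"unknown side: {side!r}")


def round_trip_return(entry_close: float, exit_close: float,
                      slippage: float = 0.0005) -> float:
    """Return of buying at entry_close and selling at exit_close with slippage."""
    entry = fill_price(entry_close, "buy", slippage)
    exit_ = fill_price(exit_close, "sell", slippage)
    return exit_ / entry - 1.0


ideal = round_trip_return(100.0, 102.0, slippage=0.0)
realistic = round_trip_return(100.0, 102.0, slippage=0.0005)

print(f"Ideal fill:      {ideal:.4%}")
print(f"With slippage:   {realistic:.4%}")
```

Because slippage is paid on both the entry and the exit, a 0.05% setting costs roughly 0.1% per round trip, which adds up quickly for strategies that trade often.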
Reading Your Results
Here is how to interpret the key metrics:
| Metric | Good | Okay | Bad |
|---|---|---|---|
| Sharpe Ratio | > 1.5 | 0.5 - 1.5 | < 0.5 |
| Max Drawdown | < 15% | 15% - 30% | > 30% |
| Win Rate (trend following) | > 40% | 30% - 40% | < 30% |
| Win Rate (mean reversion) | > 55% | 45% - 55% | < 45% |
| Profit Factor | > 2.0 | 1.2 - 2.0 | < 1.2 |
A strategy does not need a high win rate to be profitable. Trend-following strategies often win only 35-45% of trades but make 2-3x more on winners than they lose on losers.
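Two of the metrics in the table, max drawdown and profit factor, are simple enough to compute by hand, which helps when sanity-checking what a library reports. The function names, equity curve, and trade PnLs below are made up for illustration:

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def profit_factor(pnls):
    """Gross profit divided by gross loss across closed trades."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


# Hypothetical account values and per-trade PnL in dollars
equity_curve = [10000, 11000, 9900, 10500, 12000]
trade_pnls = [300, -150, 450, -100, -200]

dd = max_drawdown(equity_curve)
pf = profit_factor(trade_pnls)

print(f"Max drawdown:  {dd:.1%}")
print(f"Profit factor: {pf:.2f}")
```

Here only two of five trades win, yet the profit factor is above 1 because the winners are larger than the losers.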
Next Steps
- Backtest every strategy before running it live or on paper
- Use out-of-sample validation (train on one period, test on another)
- Compare your strategy to a simple buy-and-hold benchmark — if you cannot beat holding, the complexity is not worth it
- Try backtesting on different assets: crypto (BTC DCA strategy), forex (forex bot), or individual stocks
- Read our AI trading 101 guide for foundational concepts
- Check the MCP servers guide for connecting Claude Code to live data feeds after your backtest looks promising
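The buy-and-hold comparison from the list above can be sketched in a few lines. The helper names and all numbers are hypothetical; in practice the equity values would come from your backtest and the prices from the same CSV:

```python
def total_return(start_value: float, end_value: float) -> float:
    """Simple return between two values."""
    return end_value / start_value - 1.0


def compare_to_buy_and_hold(strategy_equity, prices):
    """Return (beats_benchmark, strategy_return, benchmark_return)."""
    strategy = total_return(strategy_equity[0], strategy_equity[-1])
    benchmark = total_return(prices[0], prices[-1])
    return strategy > benchmark, strategy, benchmark


# Made-up strategy account values and asset prices over the same period
strategy_equity = [10000.0, 10400.0, 10100.0, 11200.0]
prices = [300.0, 330.0, 310.0, 360.0]

beats, strat_ret, bench_ret = compare_to_buy_and_hold(strategy_equity, prices)

print(f"Strategy:     {strat_ret:.1%}")
print(f"Buy and hold: {bench_ret:.1%}")
print("Beats benchmark" if beats else "Does not beat benchmark")
```

A fuller comparison would also look at drawdown and Sharpe ratio for both, since a strategy with a lower return can still be preferable if it carries much less risk.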