Training a simple weekly resistance model for SPY (Part 1)

July 18, 2024

This is my first attempt at developing a machine learning model for quantitative analysis and trading. I don't know what to expect, but I do know I will run into plenty of challenges along the way. This model will be very simple and will likely underfit or overfit, be inaccurate, and not be useful for trading.

Objective: Create a simple XGBoost model to predict weekly resistance levels for SPY.

Data

  • SPY daily price data (20 years)

Preprocessing the data

  1. Columns: datetime, high, low
  2. Round the high and low to ticks (tick size: 1% of the price, rounded to 1 significant digit)
  3. Add a column for the day of the week (0-4)
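A minimal sketch of these three steps with pandas. The tick rule is my reading of step 2 (take 1% of the price, round that to one significant digit, then snap the price to the nearest multiple of it), and the helper names `tick_size`, `round_to_tick`, and `preprocess` are made up for illustration.

```python
import numpy as np
import pandas as pd


def tick_size(price: float) -> float:
    """1% of the price, rounded to one significant digit (assumed reading of step 2)."""
    raw = abs(price) * 0.01
    if raw == 0:
        return 0.0
    magnitude = 10.0 ** np.floor(np.log10(raw))
    return float(np.round(raw / magnitude) * magnitude)


def round_to_tick(price: float) -> float:
    """Snap a price to the nearest multiple of its tick size."""
    tick = tick_size(price)
    return price if tick == 0 else float(np.round(price / tick) * tick)


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Keep datetime/high/low, round prices to ticks, add the day of the week."""
    out = df[["datetime", "high", "low"]].copy()
    out["datetime"] = pd.to_datetime(out["datetime"])
    out["high"] = out["high"].map(round_to_tick)
    out["low"] = out["low"].map(round_to_tick)
    out["day_of_week"] = out["datetime"].dt.dayofweek  # Monday=0 ... Friday=4
    return out
```

With this rule, a price of 432.10 gets a tick of 4 and is rounded to 432.0, so higher prices are bucketed more coarsely than lower ones.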

SPY Daily High and Low Prices

Preparing the dataset

  • Features: array of high and low prices
  • Window: 10 days of data (2 weeks, Monday to Friday)
  • Exclude windows that don't have all 10 days
  • Target: the high and low of the following week
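The windowing above could be sketched roughly like this. It assumes a DataFrame with a parsed `datetime` column plus `high` and `low`, groups rows by calendar week, and treats the target week's high as the maximum daily high and its low as the minimum daily low. The function name `build_dataset` and the feature order (10 highs followed by 10 lows) are my own choices, not something fixed by the post.

```python
import numpy as np
import pandas as pd


def build_dataset(df: pd.DataFrame):
    """Turn daily rows into (features, high target, low target) arrays."""
    df = df.sort_values("datetime")
    # Weekly periods run Monday through Sunday; a full trading week has 5 rows.
    weeks = [group for _, group in df.groupby(df["datetime"].dt.to_period("W"))]

    X, y_high, y_low = [], [], []
    for week1, week2, target_week in zip(weeks, weeks[1:], weeks[2:]):
        if len(week1) != 5 or len(week2) != 5:
            continue  # skip windows that don't have all 10 days
        window = pd.concat([week1, week2])
        X.append(np.concatenate([window["high"].to_numpy(),
                                 window["low"].to_numpy()]))
        y_high.append(target_week["high"].max())
        y_low.append(target_week["low"].min())
    return np.array(X), np.array(y_high), np.array(y_low)
```

Each sample ends up with 20 features. Note that this sketch does not require the target week itself to have 5 days; whether it should is left open here.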

Model

A simple XGBRegressor model with 100 estimators, a learning rate of 0.1, and default values for all other parameters.

Data is also scaled using StandardScaler.

I know this is very bad, but I want to start simple and then improve it.

Separate models are trained for the high and low prices.

Results

XGBoost Regressor Performance for High Data:
Train MSE: 0.3408
Test MSE: 26.6598
Train R2: 1.0000
Test R2: 0.9980
XGBoost Regressor Performance for Low Data:
Train MSE: 0.2902
Test MSE: 56.1965
Train R2: 1.0000
Test R2: 0.9958
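For reference, the two metrics reported above can be written out in a few lines of NumPy. This is a sketch of the standard definitions (mean squared error and the coefficient of determination), not the exact code that produced the numbers; the toy arrays are made up.

```python
import numpy as np


def mse(y_true, y_pred) -> float:
    """Mean squared error: average of squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true, y_pred) -> float:
    """R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Toy example: one prediction is off by 1.
y_true = [1, 2, 3, 4]
y_pred = [1, 2, 3, 5]
print(mse(y_true, y_pred))  # 0.25
print(r2(y_true, y_pred))   # 0.8
```

Because $R^2$ compares the residuals to the spread of the targets, a target with a very wide range can give a high $R^2$ even when the MSE is not small.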

Analysis

The model looks great, right? No.

$R^2$ for training is perfect ($1.0$), and the test $R^2$ is very high too ($0.9980$ for highs and $0.9958$ for lows).

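But I know this is very bad. The test MSE is about 78 times the training MSE for the high model and about 194 times for the low model, so the model is likely overfitting and not generalizing well enough to be useful for trading.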
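The gap between training and test error can be quantified directly from the reported numbers. The short calculation below uses only the figures from the Results section; the implied variance comes from rearranging $R^2 = 1 - \mathrm{MSE} / \mathrm{Var}(y)$ and is only approximate, because the reported $R^2$ is rounded to four decimals.

```python
reported = {
    "high": {"train_mse": 0.3408, "test_mse": 26.6598, "test_r2": 0.9980},
    "low": {"train_mse": 0.2902, "test_mse": 56.1965, "test_r2": 0.9958},
}

summary = {}
for name, m in reported.items():
    summary[name] = {
        # How many times larger the test error is than the training error.
        "mse_ratio": m["test_mse"] / m["train_mse"],
        # Root mean squared error, in the same units as the price.
        "test_rmse": m["test_mse"] ** 0.5,
        # Approximate variance of the test targets implied by MSE and R^2.
        "implied_target_var": m["test_mse"] / (1 - m["test_r2"]),
    }
    s = summary[name]
    print(f"{name}: test/train MSE ratio = {s['mse_ratio']:.1f}, "
          f"test RMSE = {s['test_rmse']:.2f}, "
          f"implied target variance = {s['implied_target_var']:.0f}")
```

The ratios come out at roughly 78 for highs and 194 for lows, and the test RMSE is roughly 5.2 and 7.5 price units. The implied target variance is in the thousands, which is why $R^2$ stays close to 1 even with errors of that size.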

Visualizing Error

XGBoost Regressor Prediction Errors

XGBoost Regressor Prediction Errors over Time

