2 S&P Weekly Predictions

Let’s paint a picture of the scenario we find ourselves in. You work at a small office in Austin, Texas as a financial analyst.

It’s likely too ambitious to believe we can correctly forecast the actual S&P 500 values. A more realistic goal is to at least discover which methods do not work.

2.1 Downloading S&P 500 Daily Data

Now, let’s add on some more columns that we know we’ll want to use in our analysis later on:

  • \(PriceChange = Price_t - Price_{t-1}\)
  • \(Return = \frac{(Price_t - Price_{t-1})}{Price_{t-1}}\)
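
As a minimal sketch of how these two columns could be computed, assuming the data sits in a pandas DataFrame with a `Price` column (the frame name `df` and the toy prices below are made up for illustration, not real S&P 500 data):

```python
import pandas as pd

# Toy closing prices (illustrative values only, not real S&P 500 data).
df = pd.DataFrame(
    {"Price": [100.0, 102.0, 101.0, 105.0]},
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

# PriceChange_t = Price_t - Price_{t-1}
df["PriceChange"] = df["Price"].diff()

# Return_t = (Price_t - Price_{t-1}) / Price_{t-1}
df["Return"] = df["PriceChange"] / df["Price"].shift(1)

print(df)
```

The first row has no previous price, so both new columns are `NaN` there; it is usually easiest to drop that row before modeling.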

2.2 EDA

How about a brief EDA, just to make sure we get a sense for what’s going on?

Well, that is kind of hard to read. Maybe log-transforming it will make it more interesting?

I suppose the takeaway from the above analysis is that the S&P 500 does tend to go up over time. Not sure that is really all that insightful.

But we can see below that the price change is clearly nowhere near constant over time.

One takeaway I see here is that there seem to be periods of high volatility: big % changes tend to be followed by big % changes (often called volatility clustering).
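
One rough way to check this beyond eyeballing a plot is to compare the lag-1 autocorrelation of the returns with that of their absolute values. The sketch below uses simulated returns with a calm stretch followed by a turbulent one (an assumption made purely for illustration, not the actual S&P 500 series), and the helper `lag1_autocorr` is a hypothetical name:

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D array."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

# Simulated returns: 500 calm periods, then 500 turbulent ones (not real data).
rng = np.random.default_rng(0)
returns = np.concatenate([
    rng.normal(0.0, 0.005, 500),
    rng.normal(0.0, 0.03, 500),
])

# The sign of each return is unpredictable here, so this should be small.
print("returns:    ", lag1_autocorr(returns))
# The size of each return depends on the regime, so this should be positive.
print("abs returns:", lag1_autocorr(np.abs(returns)))
```

If the real return series behaves the same way (little autocorrelation in the returns, noticeable autocorrelation in their magnitude), that would support the reading that volatility is persistent even when direction is not.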

2.3 Modeling

EDA should give us some idea of the proper mathematical models to use to fit the data.

2.3.1 Base Model

I think it is important to always start with a base model. You need some sort of benchmark with which to judge the performance of your more complicated models later on. If the fancier models do not yield much additional predictive or explanatory power, then they most likely are not worth the added complexity.

The base model predicts that the value next week is equal to today’s value. This assumes that the data follows a random walk with no drift.
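
A minimal sketch of this benchmark, assuming a weekly price series held in a NumPy array (the function name `naive_forecast` and the toy values are illustrative, not from the original analysis):

```python
import numpy as np

def naive_forecast(prices):
    """Random-walk forecast: the prediction for period t+1 is the value at t.

    Returns forecasts aligned with the actual values prices[1:].
    """
    prices = np.asarray(prices, dtype=float)
    return prices[:-1]

weekly = np.array([100.0, 102.0, 101.0, 105.0, 104.0])  # toy values, not real data
preds = naive_forecast(weekly)
actual = weekly[1:]

mae = np.mean(np.abs(actual - preds))            # mean absolute error
rmse = np.sqrt(np.mean((actual - preds) ** 2))   # root mean squared error
print(mae, rmse)
```

Whatever error metric is chosen here becomes the number that every fancier model has to beat.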

2.3.2 Rolling Mean
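
No code survives for this section, so the following is only a sketch of what a rolling-mean forecast usually looks like: the prediction for the next period is the average of the last `window` observations. The function name, the window length, and the toy prices are all assumptions.

```python
import numpy as np

def rolling_mean_forecast(prices, window):
    """Forecast each value as the mean of the `window` values before it.

    Returns forecasts aligned with the actual values prices[window:].
    """
    prices = np.asarray(prices, dtype=float)
    return np.array([
        prices[i:i + window].mean()
        for i in range(len(prices) - window)
    ])

weekly = np.array([100.0, 102.0, 101.0, 105.0, 104.0])  # toy values, not real data
preds = rolling_mean_forecast(weekly, window=2)
actual = weekly[2:]
mae = np.mean(np.abs(actual - preds))
print(preds, mae)
```

With `window=1` this collapses to the random-walk base model, which makes the two easy to compare on the same footing.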

2.3.3 Base Model with Economic Intuition

This model accounts for stocks having a risk premium. It is equivalent to a geometric random walk with drift.
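
One way to write a geometric random walk with drift is \(\log P_{t+1} = \log P_t + \mu + \varepsilon_{t+1}\), where \(\mu\) is the average log return per period. The sketch below estimates \(\mu\) from a training window and uses \(P_t e^{\mu}\) as the point forecast. The function name and toy prices are assumptions, and this simple version ignores any variance correction to the forecast.

```python
import numpy as np

def drift_forecast(train_prices, horizon=1):
    """Point forecast `horizon` periods ahead under a geometric random walk with drift."""
    train_prices = np.asarray(train_prices, dtype=float)
    log_returns = np.diff(np.log(train_prices))
    mu = log_returns.mean()  # estimated drift per period
    return float(train_prices[-1] * np.exp(mu * horizon))

# Toy series growing exactly 10% per period (illustrative, not real data).
train = [100.0, 110.0, 121.0]
print(drift_forecast(train))
```

Setting the drift to zero recovers the base model, so any improvement over the benchmark comes entirely from the estimated risk premium.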

2.3.4 Time-Series Regression
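
The regressors used in this section are not recoverable from the text, so as one possible illustration, here is an AR(1) regression of each period's return on the previous period's return, fit by ordinary least squares. The name `fit_ar1` and the simulated series are assumptions.

```python
import numpy as np

def fit_ar1(returns):
    """Least-squares fit of r_t = a + b * r_{t-1}; returns (a, b)."""
    r = np.asarray(returns, dtype=float)
    X = np.column_stack([np.ones(len(r) - 1), r[:-1]])  # intercept and lagged return
    y = r[1:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[0]), float(coef[1])

# Simulated returns with mild persistence (not real S&P 500 data).
rng = np.random.default_rng(1)
r = np.zeros(500)
for t in range(1, len(r)):
    r[t] = 0.001 + 0.3 * r[t - 1] + rng.normal(0.0, 0.01)

a, b = fit_ar1(r)
next_return = a + b * r[-1]  # one-step-ahead forecast
print(a, b, next_return)
```

A slope near zero on the real data would say that last period's return carries little information about the next one, which is what the random-walk base model assumes.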

2.3.5 Cross-Validation

A better approach would involve using cross-validation to choose the appropriate model parameters.
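
With time-series data, shuffled k-fold splits would let the model peek at the future, so a common substitute is walk-forward validation: at each step, forecast using only earlier data, then score the forecast. The sketch below uses that idea to pick a rolling-mean window. The function names, candidate windows, and simulated prices are assumptions, and this is only one of several reasonable validation schemes.

```python
import numpy as np

def walk_forward_mae(prices, window, start):
    """MAE of one-step-ahead rolling-mean forecasts for t = start .. len(prices) - 1.

    Each forecast uses only the `window` values strictly before t.
    Requires start >= window.
    """
    prices = np.asarray(prices, dtype=float)
    errors = [
        abs(prices[t] - prices[t - window:t].mean())
        for t in range(start, len(prices))
    ]
    return float(np.mean(errors))

def choose_window(prices, candidates, start):
    """Score every candidate window on the same evaluation period; return the best."""
    scores = {k: walk_forward_mae(prices, k, start) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Simulated random-walk prices (not real data).
rng = np.random.default_rng(2)
prices = 100.0 + np.cumsum(rng.normal(0.0, 1.0, 200))
best, scores = choose_window(prices, candidates=[1, 4, 12], start=12)
print(best, scores)
```

Using the same `start` for every candidate matters: otherwise short windows get scored on more (and different) periods than long ones.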

2.3.6 Simplifying Problem

Essential question: when we simplify the input data, is the information we lose by throwing data away outweighed by the gain from eliminating some of the noise? Our hope is that it is.

2.3.6.1 Predicting Up or Down
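
As a sketch of what this simplification could look like, the prices can be reduced to a binary label (1 if the price rose, 0 otherwise) and any classifier compared against the simplest possible rule: always predict whichever direction was more common in the training data. The function names and toy prices are assumptions.

```python
import numpy as np

def direction_labels(prices):
    """1 if the price rose from the previous period, 0 if it fell or was flat."""
    prices = np.asarray(prices, dtype=float)
    return (np.diff(prices) > 0).astype(int)

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the more common training label (ties go to 'up')."""
    majority = int(np.mean(train_labels) >= 0.5)
    return float(np.mean(np.asarray(test_labels) == majority))

weekly = [100.0, 102.0, 101.0, 105.0, 104.0, 106.0, 108.0]  # toy values, not real data
labels = direction_labels(weekly)
train, test = labels[:4], labels[4:]
print(labels, majority_baseline_accuracy(train, test))
```

Because the index tends to go up over time, "always up" can be a surprisingly hard baseline to beat, so accuracy alone should be read against it rather than against 50%.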

2.4 Conclusions

  • Which approach is best
  • A high \(R^2\) does not mean we have a profitable trading strategy…
  • Adding more predictor variables would be nice.
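
To illustrate the point about \(R^2\): on a simulated random walk, regressing the price on last period's price typically gives an \(R^2\) close to 1, even though the returns are pure noise and nothing tradable is being predicted. The sketch below is built on simulated data (an assumption, not the S&P 500 series), and `r_squared` is a hypothetical helper.

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (the squared correlation)."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

# Simulated geometric random walk: returns are i.i.d. noise, unpredictable by design.
rng = np.random.default_rng(42)
log_returns = rng.normal(0.0, 0.02, 1000)
prices = 100.0 * np.exp(np.cumsum(log_returns))

r2_levels = r_squared(prices[:-1], prices[1:])             # price on lagged price
r2_returns = r_squared(log_returns[:-1], log_returns[1:])  # return on lagged return
print(r2_levels, r2_returns)
```

The price-level fit looks impressive only because next period's price is always close to this period's. Profit depends on predicting the change, which is where the second number tells the real story.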