MacroMarkets ML
A release-aware R forecasting project testing whether unemployment and recent market behavior predict next-month S&P 500 direction.
01 / Overview
MacroMarkets ML is a reproducible R project that tests whether unemployment data and recent market behavior can help classify next-month S&P 500 direction. The project was designed as an honest forecasting exercise rather than a trading strategy, with careful attention to data timing, look-ahead bias, chronological validation, and baseline comparison.
02 / Problem
Financial forecasting projects can easily overstate performance if macroeconomic data are not aligned to when they would have actually been available. This project explored whether a small set of macro and market features contained useful directional signal after applying conservative release-aware assumptions.
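A minimal sketch of the release-aware idea, using base R only. The values are made up for illustration (not FRED data), and `release_lag_months` and `lag_vec` are hypothetical names. The one-month lag is an assumed conservative rule, not necessarily the exact rule used in the project.

```r
# Toy monthly unemployment series indexed by reference month.
# Values are illustrative only, not real FRED observations.
unrate <- data.frame(
  ref_month = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01")),
  unrate    = c(4.0, 4.1, 4.3, 4.2)
)

# Assumption: a reference month's value is treated as usable only one
# full month later, because it is published after the month ends.
release_lag_months <- 1

# Shift a vector forward by k positions, padding the start with NA.
lag_vec <- function(x, k) {
  if (k == 0) return(x)
  c(rep(NA, k), head(x, -k))
}

features <- data.frame(
  feature_month    = unrate$ref_month,
  unrate_naive     = unrate$unrate,                              # leaks: not yet published
  unrate_available = lag_vec(unrate$unrate, release_lag_months)  # release-aware
)

print(features)
```

The naive column pairs each month with a figure that had not yet been published at that time, while the lagged column only uses values that would already have been public.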
03 / What I built
- Developed an end-to-end R analysis pipeline using S&P 500 market data and FRED unemployment data.
- Engineered release-aware features so unemployment observations were aligned to when they would have been publicly available.
- Converted daily market data into monthly return features by summing daily log returns, which is equivalent to compounding daily simple returns.
- Built a logistic regression classifier to predict whether the next month's S&P 500 return would be positive.
- Used chronological holdout validation and train-only feature standardization to avoid leakage.
- Evaluated performance with accuracy, balanced accuracy, ROC AUC, Brier score, and confusion matrices.
- Compared the model against the market's historical positive-month baseline instead of only reporting raw accuracy.
- Published a reproducible HTML report using R Markdown and GitHub Pages.
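The core modeling steps above can be sketched in base R as follows. This is an illustration on simulated prices, not the project's code: the price series, the two return features, and the 75/25 split are assumptions, and the unemployment feature is omitted for brevity (it would enter as another column, lagged for release timing).

```r
set.seed(1)

# Simulated daily closing prices (illustrative only, not S&P 500 data).
dates <- seq(as.Date("2015-01-01"), as.Date("2020-12-31"), by = "day")
close <- 100 * exp(cumsum(rnorm(length(dates), mean = 0.0003, sd = 0.01)))

# Daily log returns sum to the monthly log return.
daily_log_ret   <- diff(log(close))
month_id        <- format(dates[-1], "%Y-%m")
monthly_log_ret <- tapply(daily_log_ret, month_id, sum)

m <- data.frame(
  month = names(monthly_log_ret),
  ret   = as.numeric(monthly_log_ret),
  stringsAsFactors = FALSE
)
n <- nrow(m)

# Features known at the end of month t; target is the direction of month t + 1.
m$ret_lag1 <- c(NA, m$ret[-n])
m$next_up  <- c(as.integer(m$ret[-1] > 0), NA)
m <- m[complete.cases(m), ]

# Chronological holdout: earliest 75% for training, latest 25% for testing.
cut   <- floor(0.75 * nrow(m))
train <- m[seq_len(cut), ]
test  <- m[(cut + 1):nrow(m), ]

# Standardize with training-set statistics only, then reuse them on the test set.
feats   <- c("ret", "ret_lag1")
mu      <- sapply(train[feats], mean)
sdv     <- sapply(train[feats], sd)
train_x <- as.data.frame(scale(train[feats], center = mu, scale = sdv))
test_x  <- as.data.frame(scale(test[feats],  center = mu, scale = sdv))
train_x$next_up <- train$next_up

# Logistic regression for P(next month's return > 0).
fit    <- glm(next_up ~ ret + ret_lag1, data = train_x, family = binomial())
p_test <- predict(fit, newdata = test_x, type = "response")
```

Because the test rows are scaled with `mu` and `sdv` from the training window, no information from the holdout period reaches the fitted model.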
04 / Key results
- The release-aware logistic regression model achieved 55.7% out-of-sample accuracy and 0.49 ROC AUC (about chance level), below the 68.9% historical positive-month baseline.
- The model did not show useful next-month directional signal from this small feature set, and the negative result was reported directly rather than tuned away.
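For reference, the reported metrics and the baseline comparison can be computed as in the sketch below. The outcome vector `y` and probabilities `p` are hypothetical toy values, not the project's predictions, and the 0.5 threshold is an assumption.

```r
# Hypothetical holdout outcomes (1 = positive month) and predicted probabilities.
y <- c(1, 1, 0, 1, 0, 1, 1, 0, 1, 1)
p <- c(0.7, 0.4, 0.6, 0.8, 0.3, 0.55, 0.45, 0.65, 0.9, 0.35)

# Classify with an assumed 0.5 probability threshold.
pred <- as.integer(p >= 0.5)

# Confusion matrix with both classes always present.
conf_mat <- table(
  actual    = factor(y,    levels = c(0, 1)),
  predicted = factor(pred, levels = c(0, 1))
)

accuracy          <- mean(pred == y)
sensitivity       <- mean(pred[y == 1] == 1)
specificity       <- mean(pred[y == 0] == 0)
balanced_accuracy <- (sensitivity + specificity) / 2
brier             <- mean((p - y)^2)

# ROC AUC via the rank (Mann-Whitney) formulation.
r   <- rank(p)
n1  <- sum(y == 1)
n0  <- sum(y == 0)
auc <- (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Baseline: accuracy of always predicting a positive month.
baseline_always_up <- mean(y == 1)

print(conf_mat)
print(c(accuracy = accuracy, balanced_accuracy = balanced_accuracy,
        auc = auc, brier = brier, baseline = baseline_always_up))
```

Comparing `accuracy` with `baseline_always_up` is what keeps a raw accuracy above 50% from looking better than it is when most months are positive.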
05 / Technical focus
This project demonstrates practical financial ML discipline: release-aware feature engineering, chronological validation, baseline comparison, reproducible reporting, and transparent treatment of negative results. It shows that I can evaluate a forecasting idea honestly instead of optimizing only for impressive-looking accuracy.
06 / Tech stack