MacroMarkets ML

A release-aware R forecasting project testing whether unemployment and recent market behavior predict next-month S&P 500 direction.

01 / Overview

MacroMarkets ML is a reproducible R project that tests whether unemployment data and recent market behavior can help classify next-month S&P 500 direction. The project was designed as an honest forecasting exercise rather than a trading strategy, with careful attention to data timing, look-ahead bias, chronological validation, and baseline comparison.

02 / Problem

Financial forecasting projects can easily overstate performance if macroeconomic data are not aligned to when they would have actually been available. This project explored whether a small set of macro and market features contained useful directional signal after applying conservative release-aware assumptions.
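
As a minimal sketch of what a conservative release-aware alignment can look like in base R: the example below assumes a one-month publication lag, so the unemployment value for a reference month is only treated as known in the following month. The toy values, the `lag_vec` helper, and the column names are hypothetical illustrations, not the project's actual code or data.

```r
# Invented unemployment values, one row per reference month (not real FRED data).
macro <- data.frame(
  month  = seq(as.Date("2020-01-01"), by = "month", length.out = 5),
  unrate = c(4.0, 4.1, 4.3, 4.2, 4.5)
)

# Assumption: a value for reference month t is only treated as known in month t + 1.
release_lag_months <- 1L

# Hypothetical helper: shift a vector back by k rows, padding the start with NA.
lag_vec <- function(x, k) {
  stopifnot(k >= 0, k < length(x))
  if (k == 0) return(x)
  c(rep(NA, k), head(x, -k))
}

# Naive feature: uses the same-month value, which was not yet published (look-ahead).
macro$unrate_naive <- macro$unrate

# Release-aware feature: the latest value a forecaster could have seen that month.
macro$unrate_known <- lag_vec(macro$unrate, release_lag_months)

print(macro)
```

A lag like this only addresses publication timing. It does not handle later revisions: FRED serves the latest revised values by default, so a stricter real-time test would likely need vintage data such as ALFRED.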

03 / What I built

  • Developed an end-to-end R analysis pipeline using S&P 500 market data and FRED unemployment data.
  • Engineered release-aware features so unemployment observations were aligned to when they would have been publicly available.
  • Converted daily market data into monthly return features by summing daily log returns within each month, which is equivalent to compounding the daily simple returns.
  • Built a logistic regression classifier to predict whether the next month's S&P 500 return would be positive.
  • Used chronological holdout validation and train-only feature standardization to avoid leakage.
  • Evaluated performance with accuracy, balanced accuracy, ROC AUC, Brier score, and confusion matrices.
  • Compared the model against the market's historical positive-month baseline instead of only reporting raw accuracy.
  • Published a reproducible HTML report using R Markdown and GitHub Pages.
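
The modeling steps in the list above can be sketched roughly as follows in base R. This is an illustration on simulated data, not the project's pipeline: the random returns, the simulated unemployment series, the 80/20 split, the one-month lag, and all variable names are assumptions made for the example.

```r
set.seed(1)

# Simulated weekday "trading" dates and daily log returns (invented data).
dates <- seq(as.Date("2015-01-01"), as.Date("2022-12-31"), by = "day")
dates <- dates[!(as.POSIXlt(dates)$wday %in% c(0, 6))]
log_ret_daily <- rnorm(length(dates), mean = 0.0003, sd = 0.01)

# Log returns are additive, so the monthly log return is the within-month sum.
month_id <- format(dates, "%Y-%m")
monthly_log_ret <- tapply(log_ret_daily, month_id, sum)
monthly <- data.frame(
  month   = names(monthly_log_ret),
  log_ret = as.numeric(monthly_log_ret)
)
monthly$simple_ret <- exp(monthly$log_ret) - 1  # compounded simple return
n <- nrow(monthly)

# Target: 1 if the NEXT month's return is positive (unknown for the last row).
monthly$up_next <- c(as.integer(monthly$log_ret[-1] > 0), NA)

# Market feature: trailing 3-month log return using current and past months only.
monthly$mom_3m <- as.numeric(stats::filter(monthly$log_ret, rep(1, 3), sides = 1))

# Macro feature: simulated unemployment, lagged one month (assumed release lag).
unrate <- 5 + cumsum(rnorm(n, 0, 0.1))
monthly$unrate_known <- c(NA, unrate[-n])

model_df <- monthly[complete.cases(monthly), ]

# Chronological holdout: the earliest 80% of months train, the rest test.
n_train <- floor(0.8 * nrow(model_df))
train <- model_df[seq_len(n_train), ]
test  <- model_df[(n_train + 1):nrow(model_df), ]

# Standardize with training-period means and SDs only, then reuse them on test.
features <- c("log_ret", "mom_3m", "unrate_known")
mu  <- colMeans(train[, features])
sdv <- apply(train[, features], 2, sd)
train[, features] <- as.data.frame(scale(train[, features], center = mu, scale = sdv))
test[, features]  <- as.data.frame(scale(test[, features],  center = mu, scale = sdv))

# Logistic regression for the probability of a positive next month.
fit <- glm(up_next ~ log_ret + mom_3m + unrate_known,
           data = train, family = binomial())
test$p_up <- predict(fit, newdata = test, type = "response")
```

Computing the scaling parameters on the training window and reusing them on the test window keeps information from the holdout period out of the fitted features.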

04 / Key results

  • The release-aware logistic regression model reached 55.7% out-of-sample accuracy and 0.49 ROC AUC. Accuracy fell short of the 68.9% historical positive-month baseline, and an AUC of 0.49 is no better than chance (0.5).
  • The model did not show useful next-month directional signal from this small feature set, and the negative result was reported directly rather than tuned away.
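
To make the baseline comparison concrete, here is a small base R sketch of the reported metric types, assuming the baseline is an "always predict up" rule whose accuracy equals the share of positive months. The `evaluate_direction` function and the eight toy observations are hypothetical and are not the project's results.

```r
# Hypothetical helper: direction-forecast metrics from labels y (0/1) and
# predicted probabilities p of an up month.
evaluate_direction <- function(y, p, threshold = 0.5) {
  stopifnot(length(y) == length(p), all(y %in% c(0, 1)))
  pred <- as.integer(p >= threshold)

  tp <- sum(pred == 1 & y == 1)
  tn <- sum(pred == 0 & y == 0)
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)

  # ROC AUC via the rank-sum identity; tied scores receive average ranks.
  r <- rank(p)
  auc <- (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

  c(
    accuracy           = (tp + tn) / length(y),
    always_up_baseline = mean(y == 1),
    balanced_accuracy  = (tp / n_pos + tn / n_neg) / 2,
    roc_auc            = auc,
    brier              = mean((p - y)^2)
  )
}

# Invented toy holdout: 6 up months, 2 down months.
y <- c(1, 1, 1, 0, 1, 0, 1, 1)
p <- c(0.60, 0.40, 0.70, 0.80, 0.55, 0.30, 0.45, 0.65)

metrics <- evaluate_direction(y, p)
print(round(metrics, 3))
# accuracy is 0.625 against an always-up baseline of 0.75, and roc_auc is 0.5
```

In this toy case the classifier is right more often than not, yet it still trails the always-up rule and ranks up and down months no better than chance. That is why raw accuracy alone can be misleading when most months are positive.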

05 / Technical focus

This project demonstrates practical financial ML discipline: release-aware feature engineering, chronological validation, baseline comparison, reproducible reporting, and transparent treatment of negative results. It shows that I can evaluate a forecasting idea honestly instead of optimizing only for impressive-looking accuracy.

06 / Tech stack

R, R Markdown, tidyverse, dplyr, ggplot2, tidyquant, quantmod, Logistic regression, Time-series feature engineering, Chronological holdout validation, Yahoo Finance data, FRED unemployment data, renv, Git, GitHub Pages, HTML, CSS