Research · Current

Decentralized In-Context Learning Benchmark for Tabular Foundation Models

Evaluating decentralized in-context learning across regression datasets, large synthetic tasks, OpenML data, and IID and non-IID settings.

01 / Overview

I am contributing to research on decentralized in-context learning (D-ICL) for tabular foundation models. The work evaluates how models such as TabPFN and TabICL perform on regression datasets, large synthetic tasks, and real-world OpenML data, under both IID and non-IID data partitions.
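To make the IID versus non-IID distinction concrete, here is a minimal sketch of one way to split a regression dataset across clients. The function name `partition_indices`, its parameters, and the target-sorted sharding used for the non-IID case are assumptions chosen for illustration; the benchmark's actual partitioning scheme may differ.

```python
import numpy as np


def partition_indices(y, n_clients, mode="iid", seed=0):
    """Split sample indices across clients (hypothetical helper).

    mode="iid":     shuffle, then split into near-equal parts.
    mode="non_iid": sort by target value, then split into contiguous
                    shards so each client sees a different target range.
                    This is one assumed form of skew, not the only one.
    """
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    if mode == "iid":
        order = rng.permutation(len(y))
    elif mode == "non_iid":
        order = np.argsort(y, kind="stable")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return np.array_split(order, n_clients)


# Toy targets standing in for a real regression dataset.
y = np.random.default_rng(0).normal(size=1000)
iid_parts = partition_indices(y, n_clients=4, mode="iid")
skew_parts = partition_indices(y, n_clients=4, mode="non_iid")
```

With the non-IID split, each client's targets cover a separate range, whereas each IID client sees roughly the full target distribution.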

02 / Technical focus

This work combines ML experimentation, benchmark engineering, GPU-aware inference, tabular foundation models, and reproducible research workflows. A major part of the contribution was scaling evaluation beyond small datasets while keeping experiments configurable, repeatable, and aligned with the paper's settings.
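As an illustration of what "configurable and repeatable" can look like in practice, the sketch below describes each run with an immutable config object and derives a stable run ID from it. The class `ExperimentConfig`, its field names, and the example values are all hypothetical and do not reflect the project's real schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # Hypothetical fields, chosen only for illustration.
    model: str        # e.g. "tabpfn" or "tabicl"
    dataset: str
    partition: str    # "iid" or "non_iid"
    n_clients: int
    seed: int

    def run_id(self) -> str:
        """Short hash of the settings: same config, same ID."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


# A small grid: two partition modes times three seeds.
configs = [
    ExperimentConfig("tabpfn", "synthetic", part, n_clients=4, seed=s)
    for part in ("iid", "non_iid")
    for s in (0, 1, 2)
]
run_ids = [cfg.run_id() for cfg in configs]
```

Keying output files by such an ID makes it easy to see which settings produced which result and to skip runs that already exist.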

03 / What I worked on

  • Expanded a D-ICL evaluation benchmark by integrating 5 public regression datasets into a Python evaluation pipeline.
  • Added large-scale synthetic regression tasks using configurable dataset generation.
  • Integrated the OpenML Yolanda dataset (ID 42705) as a large real-world regression benchmark.
  • Adapted experiments to 120k-sample large-regression settings when full-scale runs were too expensive.
  • Ran TabPFN and TabICL experiments across IID and non-IID data partitions.
  • Evaluated results using RMSE, MAE, and R², with centralized baselines for comparison and mean and standard-deviation summaries.
  • Implemented batched regression inference to address CUDA memory limits during large test-set prediction.
  • Generated reproducible JSON and CSV summaries for research review and paper validation.
  • Supported a paper currently under review at NeurIPS.
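
The batched inference and metric summaries listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: `predict_in_batches`, `regression_metrics`, and `summarize` are hypothetical helpers, `stub_predict` is a stand-in for a fitted regressor's `predict` method, and aggregating over repeated seeded runs is an assumption about what the mean and standard deviation are computed over.

```python
import json

import numpy as np


def predict_in_batches(predict_fn, X_test, batch_size=4096):
    """Call predict_fn on slices of X_test and concatenate the outputs."""
    chunks = [
        np.asarray(predict_fn(X_test[start:start + batch_size]))
        for start in range(0, len(X_test), batch_size)
    ]
    return np.concatenate(chunks)


def regression_metrics(y_true, y_pred):
    """RMSE, MAE, and R² (R² is undefined for a constant target)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "r2": 1.0 - ss_res / ss_tot,
    }


def summarize(runs):
    """Mean and sample standard deviation of each metric across runs."""
    return {
        name: {
            "mean": float(np.mean([r[name] for r in runs])),
            "std": float(np.std([r[name] for r in runs], ddof=1)),
        }
        for name in runs[0]
    }


# Toy data and a stand-in linear predictor (not a real foundation model).
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))
w = np.arange(1.0, 6.0)
y = X @ w


def stub_predict(X_batch):
    return X_batch @ w


y_pred = predict_in_batches(stub_predict, X, batch_size=1024)
metrics = regression_metrics(y, y_pred)

# Simulate three seeded runs by adding noise to the predictions.
runs = [
    regression_metrics(y, y_pred + np.random.default_rng(s).normal(scale=0.1, size=len(y)))
    for s in (0, 1, 2)
]
summary = summarize(runs)
summary_json = json.dumps(summary, indent=2)
```

Batching bounds the number of test rows processed per call, which is typically what relieves GPU memory pressure on large test sets. The batch size is a tuning knob that depends on the model and hardware.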

04 / Tools and topics

Python, PyTorch / CUDA, NumPy, Pandas, scikit-learn, TabPFN, TabICL, OpenML, Git, JSON / CSV outputs, Batched inference, Tabular foundation models, D-ICL, Large-scale regression, Reproducible ML experiments