Decentralized In-Context Learning Benchmark for Tabular Foundation Models
Evaluating decentralized in-context learning across regression datasets, large synthetic tasks, OpenML data, and IID and non-IID settings.
01 / Overview
I am contributing to research on decentralized in-context learning for tabular foundation models. The work focuses on evaluating how models such as TabPFN and TabICL perform across regression datasets, large synthetic tasks, real-world OpenML data, and IID/non-IID experimental settings.
02 / Technical focus
This work combines ML experimentation, benchmark engineering, GPU-aware inference, tabular foundation models, and reproducible research workflows. A major part of my contribution has been scaling evaluation beyond small datasets while keeping experiments configurable, repeatable, and aligned with the paper's settings.
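To illustrate what repeatable IID and non-IID partitions can look like, here is a minimal sketch, not the benchmark's actual code. The function names (`iid_partition`, `target_skew_partition`) are hypothetical, and the non-IID strategy shown (giving each client a contiguous band of sorted target values) is only one assumed way to induce skew; the project's real partitioning scheme may differ.

```python
import numpy as np


def iid_partition(n_samples, n_clients, seed):
    """Shuffle row indices with a seeded generator and split them evenly.

    The same seed always yields the same partition, which keeps runs repeatable.
    """
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_clients)


def target_skew_partition(y, n_clients):
    """Illustrative non-IID split (an assumption, not the paper's scheme).

    Rows are sorted by target value, so each client sees a different
    contiguous band of the target distribution.
    """
    order = np.argsort(y, kind="stable")
    return np.array_split(order, n_clients)


# Toy targets, only for demonstration.
y = np.linspace(0.0, 1.0, 12)
iid_parts = iid_partition(len(y), n_clients=3, seed=0)
skew_parts = target_skew_partition(y, n_clients=3)
```

Deriving every partition from an explicit seed means a configuration file only needs to record the seed and the number of clients to reproduce a split.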
03 / What I worked on
- Expanded a decentralized in-context learning (D-ICL) evaluation benchmark by integrating five public regression datasets into a Python evaluation pipeline.
- Added large-scale synthetic regression tasks using configurable dataset generation.
- Integrated the OpenML Yolanda dataset (ID 42705) as a large real-world regression benchmark.
- Adapted experiments to 120k-sample large-regression settings when full-scale runs were too expensive.
- Ran TabPFN and TabICL experiments across IID and non-IID data partitions.
- Evaluated results with RMSE, MAE, and R², compared them against centralized baselines, and reported means and standard deviations.
- Implemented batched regression inference to address CUDA memory limits during large test-set prediction.
- Generated reproducible JSON and CSV summaries for research review and paper validation.
- Supported a paper currently under review at NeurIPS.
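The batched inference and metric summaries mentioned in the list above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's code: it assumes a fitted model with a scikit-learn-style `predict` method whose predictions for one test row do not depend on the other test rows, so slicing the test set leaves the results unchanged. `predict_in_batches`, `regression_metrics`, `summarize`, and the stand-in `SumModel` are hypothetical names, and the default batch size is arbitrary.

```python
import numpy as np


def predict_in_batches(model, X_test, batch_size=4096):
    """Predict consecutive row slices and concatenate the results.

    Only one slice of the test set is passed to the model at a time,
    which bounds peak memory during prediction.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    chunks = []
    for start in range(0, len(X_test), batch_size):
        batch = X_test[start:start + batch_size]
        chunks.append(np.asarray(model.predict(batch)))
    return np.concatenate(chunks) if chunks else np.empty(0)


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 (R^2 is undefined for a constant target)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "r2": 1.0 - ss_res / ss_tot,
    }


def summarize(runs):
    """Map each metric to (mean, population std) across a list of runs."""
    return {
        key: (float(np.mean([r[key] for r in runs])),
              float(np.std([r[key] for r in runs])))
        for key in runs[0]
    }


class SumModel:
    """Stand-in for a fitted regressor; predicts the row sum."""

    def predict(self, X):
        return X.sum(axis=1)


# Toy data, only for demonstration.
X_demo = np.arange(20, dtype=float).reshape(10, 2)
model = SumModel()
y_batched = predict_in_batches(model, X_demo, batch_size=3)
metrics = regression_metrics(X_demo.sum(axis=1), y_batched)
summary = summarize([metrics, metrics])
```

In practice the batch size would be tuned to the available GPU memory, and the per-run metric dictionaries could be written out with `json.dump` to produce the kind of summaries described above.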
04 / Tools and topics