Research / Current

Decentralized In-Context Learning Benchmark for Tabular Foundation Models

Evaluating decentralized in-context learning across regression datasets, large synthetic tasks, OpenML data, and IID and non-IID settings.

01 / Overview

I am contributing to research on decentralized in-context learning for tabular foundation models. The work focuses on evaluating how models such as TabPFN and TabICL perform across regression datasets, large synthetic tasks, real-world OpenML data, and IID/non-IID experimental settings.
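To make the IID/non-IID distinction concrete, here is a minimal sketch of two ways to split sample indices across clients. The function names and the target-sorted scheme are illustrative assumptions, not the partitioning used in the paper: sorting by target is simply one common way to induce label skew in regression.

```python
import random


def partition_iid(n_samples, n_clients, seed=0):
    """Shuffle indices and deal them round-robin.

    Each client receives a random subset, so all clients see
    roughly the same data distribution.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[c::n_clients] for c in range(n_clients)]


def partition_by_target(targets, n_clients):
    """Sort indices by target value and cut contiguous chunks.

    Each client then sees a different range of the target,
    which is one simple form of non-IID (label-skewed) data.
    This scheme is an assumption made for illustration.
    """
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    base, extra = divmod(len(order), n_clients)
    parts, start = [], 0
    for c in range(n_clients):
        size = base + (1 if c < extra else 0)  # spread the remainder
        parts.append(order[start:start + size])
        start += size
    return parts


# Toy data: ten samples whose target equals their index.
targets = [float(i) for i in range(10)]
iid_parts = partition_iid(len(targets), n_clients=3, seed=0)
skewed_parts = partition_by_target(targets, n_clients=3)
```

In the skewed split, the first client holds only the smallest targets and the last client only the largest, so a model conditioned on a single client's data sees a narrow slice of the target range.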

02 / Technical focus

This work combines ML experimentation, benchmark engineering, GPU-aware inference, tabular foundation models, and reproducible research workflows. A major part of my contribution has been scaling evaluation beyond small datasets while keeping experiments configurable, repeatable, and aligned with paper settings.
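One way to keep runs configurable and repeatable is to store the configuration next to the per-seed metrics and their summary. The sketch below is a stdlib-only illustration; `ExperimentConfig`, its field names, and the toy predictions are hypothetical and do not reflect the actual pipeline.

```python
import json
import math
import statistics
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # Hypothetical fields, chosen only for illustration.
    dataset: str
    model: str
    partition: str  # e.g. "iid" or "non_iid"
    n_clients: int
    seeds: tuple


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # R^2 is undefined when the targets are constant.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


def summarize(per_seed):
    """Mean and sample standard deviation of each metric across seeds."""
    summary = {}
    for name in per_seed[0]:
        values = [m[name] for m in per_seed]
        summary[name] = {
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


config = ExperimentConfig("toy", "placeholder-model", "iid", 4, (0, 1))

# Toy predictions standing in for two seeded runs.
per_seed = [
    regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]),
    regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
]

report = json.dumps(
    {"config": asdict(config), "summary": summarize(per_seed)}, indent=2
)
```

Writing `report` to disk gives a single file that records both what was run and what it produced, which makes later comparison against paper settings easier.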

03 / What I worked on

Expanded a decentralized in-context learning (D-ICL) evaluation benchmark by integrating five public regression datasets into a Python evaluation pipeline.
Added large-scale synthetic regression tasks using configurable dataset generation.
Integrated the OpenML Yolanda dataset (ID 42705) as a large real-world regression benchmark.
Adapted experiments to 120k-sample large-regression settings when full-scale runs were too expensive.
Ran TabPFN and TabICL experiments across IID and non-IID data partitions.
Evaluated results using RMSE, MAE, and R², with comparisons against centralized baselines and mean and standard-deviation summaries.
Implemented batched regression inference to address CUDA memory limits during large test-set prediction.
Generated reproducible JSON and CSV summaries for research review and paper validation.
Supported a paper currently under review at NeurIPS.
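The batched inference item above can be sketched as follows. This is a minimal illustration, not the project's implementation: `predict_in_batches` and `toy_predict` are hypothetical names, and `predict_fn` stands in for the prediction method of whatever fitted regressor is used. It assumes the prediction for a test row does not depend on the other test rows in the same call, so chunking leaves the results unchanged.

```python
def predict_in_batches(predict_fn, X_test, batch_size=1024):
    """Call predict_fn on consecutive slices of X_test and join the outputs.

    Only one slice of the test set is passed to the model at a time,
    so peak memory depends on batch_size rather than on the full
    test-set size.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    predictions = []
    for start in range(0, len(X_test), batch_size):
        batch = X_test[start:start + batch_size]
        predictions.extend(predict_fn(batch))
    return predictions


def toy_predict(rows):
    """Stand-in for a fitted regressor: predicts the sum of each row."""
    return [sum(row) for row in rows]


X_test = [[float(i), 1.0] for i in range(10)]
batched = predict_in_batches(toy_predict, X_test, batch_size=4)
```

The helper only relies on `len` and slicing, so it should also accept array-like inputs. A suitable `batch_size` depends on the model, the context size, and the available GPU memory, so it is best treated as a tunable setting.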

04 / Tools and topics

Python, PyTorch / CUDA, NumPy, Pandas, scikit-learn, TabPFN, TabICL, OpenML, Git, JSON / CSV outputs, Batched inference, Tabular foundation models, D-ICL, Large-scale regression, Reproducible ML experiments
