Poster
in
Workshop: Table Representation Learning

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

Noah Hollmann · Samuel Müller · Katharina Eggensperger · Frank Hutter

Keywords: causal reasoning Real-time Machine Learning tabular data bayesian inference Green AI AutoML

Project Page [ Poster] [ OpenReview]

Abstract

We present TabPFN, a trained Transformer model that can do tabular supervised classification for small datasets in less than a second, needs no hyperparameter tuning and is competitive with state-of-the-art classification methods.TabPFN is entailed in the weights of our network, which accepts training and test samples as a set-valued input and yields predictions for the entire test set in a single forward pass. TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to approximate Bayesian inference on synthetic datasets drawn from our prior. Our prior incorporates ideas from causal learning: It entails a large space of structural causal models with a preference for simple structures. Afterwards, the trained TabPFN approximates Bayesian prediction on any unseen tabular dataset, without any hyperparameter tuning or gradient-based learning.On 30 datasets from the OpenML-CC18 suite, we show that our method outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with a $70\times$ speedup. This increases to a $3\,200\times$ speedup when a GPU is available.We provide all our code and the trained TabPFN at https://anonymous.4open.science/r/TabPFN-2AEE. We also provide an online demo at https://huggingface.co/spaces/TabPFN/TabPFNPrediction.

Chat is not available.