Timezone: »

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Noah Hollmann · Samuel Müller · Katharina Eggensperger · Frank Hutter

Fri Dec 02 12:45 PM -- 01:00 PM (PST) @

We present TabPFN, a trained Transformer model that can do tabular supervised classification for small datasets in less than a second, needs no hyperparameter tuning and is competitive with state-of-the-art classification methods.TabPFN is entailed in the weights of our network, which accepts training and test samples as a set-valued input and yields predictions for the entire test set in a single forward pass. TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to approximate Bayesian inference on synthetic datasets drawn from our prior. Our prior incorporates ideas from causal learning: It entails a large space of structural causal models with a preference for simple structures. Afterwards, the trained TabPFN approximates Bayesian prediction on any unseen tabular dataset, without any hyperparameter tuning or gradient-based learning.On 30 datasets from the OpenML-CC18 suite, we show that our method outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with a $70\times$ speedup. This increases to a $3\,200\times$ speedup when a GPU is available.We provide all our code and the trained TabPFN at https://anonymous.4open.science/r/TabPFN-2AEE. We also provide an online demo at https://huggingface.co/spaces/TabPFN/TabPFNPrediction.

#### Author Information

##### Frank Hutter (University of Freiburg & Bosch)

Frank Hutter is a Full Professor for Machine Learning at the Computer Science Department of the University of Freiburg (Germany), where he previously was an assistant professor 2013-2017. Before that, he was at the University of British Columbia (UBC) for eight years, for his PhD and postdoc. Frank's main research interests lie in machine learning, artificial intelligence and automated algorithm design. For his 2009 PhD thesis on algorithm configuration, he received the CAIAC doctoral dissertation award for the best thesis in AI in Canada that year, and with his coauthors, he received several best paper awards and prizes in international competitions on machine learning, SAT solving, and AI planning. Since 2016 he holds an ERC Starting Grant for a project on automating deep learning based on Bayesian optimization, Bayesian neural networks, and deep reinforcement learning.