OpenML Benchmarking Suites
Bernd Bischl · Giuseppe Casalicchio · Matthias Feurer · Pieter Gijsbers · Frank Hutter · Michel Lang · Rafael Gomes Mantovani · Jan van Rijn · Joaquin Vanschoren
Event URL: https://openreview.net/forum?id=OCrD8ycKjG

Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. We advocate the use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks. We enable this through software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the OpenML platform, and accessible through interfaces in Python, Java, and R. OpenML benchmarking suites (a) are easy to use through standardized data formats, APIs, and client libraries; (b) come with extensive meta-information on the included datasets; and (c) allow benchmarks to be shared and reused in future studies. We then present a first, carefully curated and practical benchmarking suite for classification: the OpenML Curated Classification benchmarking suite 2018 (OpenML-CC18). Finally, we discuss use cases and applications which demonstrate the usefulness of OpenML benchmarking suites and the OpenML-CC18 in particular.
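
As a rough illustration of the Python interface mentioned above, the following sketch loops over a few OpenML-CC18 tasks and scores a trivial majority-class baseline on each task's predefined first fold. It assumes the `openml` and `numpy` packages are installed and that the OpenML server is reachable. The helper names (`run_suite`, `cc18_task_ids`, `majority_class_accuracy`) are hypothetical and chosen only for this example, and the baseline is a stand-in for a real learner, not a method from the paper.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List


def run_suite(task_ids: Iterable[int],
              evaluate_task: Callable[[int], float]) -> Dict[int, float]:
    """Apply one evaluation function to every task ID; return a score per task."""
    return {task_id: evaluate_task(task_id) for task_id in task_ids}


def cc18_task_ids(limit: int = 3) -> List[int]:
    """Fetch the first `limit` task IDs of OpenML-CC18 (requires network access)."""
    import openml  # imported lazily so run_suite stays usable without it

    suite = openml.study.get_suite("OpenML-CC18")
    return list(suite.tasks)[:limit]


def majority_class_accuracy(task_id: int) -> float:
    """Accuracy of always predicting the most frequent training label.

    Uses the task's own train/test split (repeat 0, fold 0), so results
    from different studies refer to the same data partition.
    """
    import numpy as np
    import openml

    task = openml.tasks.get_task(task_id)
    _, y = task.get_X_and_y()
    y = np.asarray(y)  # works whether the client returns an array or a Series
    train_idx, test_idx = task.get_train_test_split_indices(repeat=0, fold=0)
    majority_label = Counter(y[train_idx].tolist()).most_common(1)[0][0]
    return float(np.mean(y[test_idx] == majority_label))


if __name__ == "__main__":
    # Downloads task and dataset metadata on first use.
    for task_id, score in run_suite(cc18_task_ids(), majority_class_accuracy).items():
        print(f"task {task_id}: majority-class accuracy {score:.3f}")
```

Passing the evaluation function as an argument keeps the loop over tasks separate from the learner, so the same loop can be reused with any model that maps a task ID to a score.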

Author Information

Bernd Bischl (LMU Munich)
Giuseppe Casalicchio (LMU Munich)
Matthias Feurer (University of Freiburg)
Pieter Gijsbers (Eindhoven University of Technology)
Frank Hutter (University of Freiburg & Bosch)

Frank Hutter is a Full Professor of Machine Learning at the Computer Science Department of the University of Freiburg (Germany), where he was previously an assistant professor from 2013 to 2017. Before that, he spent eight years at the University of British Columbia (UBC) for his PhD and postdoc. Frank's main research interests lie in machine learning, artificial intelligence, and automated algorithm design. For his 2009 PhD thesis on algorithm configuration, he received the CAIAC doctoral dissertation award for the best thesis in AI in Canada that year. With his coauthors, he has received several best paper awards and prizes in international competitions on machine learning, SAT solving, and AI planning. Since 2016 he has held an ERC Starting Grant for a project on automating deep learning based on Bayesian optimization, Bayesian neural networks, and deep reinforcement learning.

Michel Lang
Rafael Gomes Mantovani (Federal Technology University of Paraná)
Jan van Rijn (Columbia University)
Joaquin Vanschoren (Eindhoven University of Technology)

Joaquin Vanschoren is an Associate Professor in Machine Learning at the Eindhoven University of Technology. He holds a PhD from the Katholieke Universiteit Leuven, Belgium. His research focuses on understanding and automating machine learning, meta-learning, and continual learning. He founded and leads OpenML.org, a popular open science platform with over 250,000 users that facilitates the sharing and reuse of machine learning datasets and models. He is a founding member of the European AI networks ELLIS and CLAIRE, and an active member of MLCommons. He has received several awards, including an Amazon Research Award, an ECML PKDD Best Demo award, and the Dutch Data Prize. He was a tutorial speaker at NeurIPS 2018 and AAAI 2021, and has given over 30 invited talks. He co-initiated the NeurIPS Datasets and Benchmarks track and was NeurIPS Datasets and Benchmarks Chair from 2021 to 2023. He also co-organized the AutoML workshop series at ICML and the Meta-Learning workshop series at NeurIPS. He is editor-in-chief of DMLR (part of JMLR), as well as an action editor for JMLR and machine learning moderator for arXiv. He has authored or co-authored over 150 scientific papers, as well as reference books on Automated Machine Learning and Meta-learning.

More from the Same Authors