Metrics Reloaded
Annika Reinke · Lena Maier-Hein · Patrick Scholz · Minu D. Tizabi · Evangelia Christodoulou · Ben Glocker · Fabian Isensee · Jens Kleesiek · Michal Kozubek · Mauricio Reyes · Michael A. Riegler · Manuel Wiesenfarth · Michael Baumgartner · Matthias Eisenmann · Doreen Heckmann-Nötzel · A. Kavur · Tim Rädsch · Laura Acion · Michela Antonelli · Tal Arbel · Spyridon Bakas · Pete Bankhead · Arriel Benis · Florian Buettner · M. Jorge Cardoso · Veronika Cheplygina · Beth Cimini · Gary Collins · Keyvan Farahani · Luciana Ferrer · Adrian Galdran · Bram van Ginneken · Robert Haase · Daniel Hashimoto · Michael Hoffman · Merel Huisman · Pierre Jannin · Charles Kahn · Dagmar Kainmueller · Alexandros Karargyris · Bernhard Kainz · Alan Karthikesalingam · Hannes Kenngott · Florian Kofler · Annette Kopp-Schneider · Anna Kreshuk · Tahsin Kurc · Bennett Landman · Geert Litjens · Amin Madani · Klaus H. Maier-Hein · Anne Martel · Peter Mattson · Erik Meijering · Bjoern Menze · David Moher · Karel G.M. Moons · Henning Mueller · Brennan Nichyporuk · Felix Nickel · Jens Petersen · Nasir Rajpoot · Nicola Rieke · Julio Saez-Rodriguez · Clarisa Sanchez · Shravya Shetty · Maarten van Smeden · Carole Sudre · Ronald Summers · Abdel Aziz Taha · Sotirios Tsaftaris · Ben Van Calster · Gaël Varoquaux · Paul Jäger

Flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. A large international expert consortium has now created Metrics Reloaded, a comprehensive framework guiding researchers towards problem-aware metric selection. The framework is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects relevant for metric selection, from the domain interest to properties of the target structure(s), data set and algorithm output. It supports image-level classification, object detection, semantic segmentation, and instance segmentation tasks. Users are guided through the process of selecting and applying appropriate validation metrics while being made aware of pitfalls. To improve the user experience, we implemented the framework in an online tool, which also provides a common point of access to explore metric weaknesses and strengths. An instantiation of the framework for various biomedical image analysis use cases demonstrates its broad applicability across domains.

Author Information

Annika Reinke (German Cancer Research Center)
Lena Maier-Hein (German Cancer Research Center (DKFZ))
Patrick Scholz (German Cancer Research Center)
Minu D. Tizabi (German Cancer Research Center (DKFZ))
Evangelia Christodoulou (German Cancer Research Center)
Ben Glocker (Imperial College London)
Fabian Isensee (German Cancer Research Center (DKFZ))
Jens Kleesiek (Institute for AI in Medicine (IKIM), University Medicine Essen)
Michal Kozubek (Masaryk University)
Mauricio Reyes (University of Bern)
Michael A. Riegler (SimulaMet)
Manuel Wiesenfarth (German Cancer Research Center)
Michael Baumgartner (German Cancer Research Center (DKFZ))
Matthias Eisenmann (German Cancer Research Center (DKFZ))
Doreen Heckmann-Nötzel (German Cancer Research Center)
A. Kavur
Tim Rädsch (German Cancer Research Center (DKFZ))
Laura Acion (University of Buenos Aires - CONICET)
Michela Antonelli (King's College London)
Tal Arbel (McGill University)
Spyridon Bakas (University of Pennsylvania)
Pete Bankhead (University of Edinburgh)
Arriel Benis (Holon Institute of Technology)
Florian Buettner (Goethe University Frankfurt; German Cancer Research Center (DKFZ))
M. Jorge Cardoso (King's College London)
Veronika Cheplygina (IT University of Copenhagen)
Beth Cimini (Broad Institute of MIT and Harvard)
Gary Collins (University of Oxford)
Keyvan Farahani (Division of Cancer Treatment and Diagnosis, National Cancer Institute)
Luciana Ferrer (CONICET - University of Buenos Aires)

Luciana Ferrer is a researcher at the Computer Science Institute of the National Scientific and Technical Research Council (CONICET) and the University of Buenos Aires (UBA), Argentina. Prior to her current position, Luciana worked at the Speech Technology and Research Laboratory, SRI International, USA. Her current research interests include speaker and language identification, mental state detection, and pronunciation scoring for second language learning. Luciana received her B.S. degree from the University of Buenos Aires, Argentina, in 2001, and her Ph.D. degree from Stanford University, USA, in 2009.

Adrian Galdran (Universitat Pompeu Fabra)
Bram van Ginneken (Radboud University)
Robert Haase (DFG Cluster of Excellence "Physics of Life" and Center for Systems Biology)
Daniel Hashimoto (University of Pennsylvania)
Michael Hoffman (University Health Network/University of Toronto)
Merel Huisman (Meander Medisch Centrum)
Pierre Jannin (Université de Rennes 1, France)
Charles Kahn (University of Pennsylvania)
Dagmar Kainmueller (BIH/MDC)
Alexandros Karargyris (IHU Strasbourg)
Bernhard Kainz (Imperial College London)

I am Professor at Friedrich-Alexander-University Erlangen-Nuremberg where I head the Image Data Exploration and Analysis Lab (IDEA Lab) and I am Reader (= US/EU Associate Professor++) in the Department of Computing at Imperial College London where I lead the human-in-the-loop computing group and co-lead the biomedical image analysis research group (BioMedIA). We are a post-pandemic, borderless research group, across nations and institutions. Our research is about intelligent algorithms in healthcare, especially Medical Imaging. We are working on self-driving medical image acquisition that can guide human operators in real-time during diagnostics. Artificial Intelligence is currently used as a blanket term to describe research in these areas.

Alan Karthikesalingam (Google)
Hannes Kenngott (University of Heidelberg)
Florian Kofler (Helmholtz AI TU Munich)
Annette Kopp-Schneider
Anna Kreshuk (EMBL)
Tahsin Kurc (Stony Brook University)
Bennett Landman (Vanderbilt University)
Geert Litjens (Radboud University Nijmegen Medical Center)

Geert Litjens studied Biomedical Engineering at Eindhoven University of Technology. Subsequently, he completed his PhD in the Diagnostic Image Analysis Group, where he worked with Henkjan Huisman on computer-aided detection of prostate cancer. He spent 2015 as a postdoctoral researcher at the National Center for Tumor Diseases in Heidelberg, Germany on an Alexander von Humboldt Society Postdoctoral Fellowship. He is currently an Assistant Professor in Computational Pathology at the Department of Pathology. His research focuses on applying machine learning to solve important questions in oncology:
- How to improve efficiency and accuracy through automation of diagnostics?
- How to quantify (un)known biomarkers for cancer progression and treatment success using machine learning?

For more details on his research group: https://www.computationalpathologygroup.eu/

Amin Madani (University Health Network)
Klaus H. Maier-Hein (German Cancer Research Center (DKFZ))
Anne Martel (Sunnybrook Research Institute, Toronto)
Peter Mattson (Google)

Leads ML Performance Metrics team at Google Brain. General Chair of MLPerf. Ph.D. Stanford University.

Erik Meijering (University of New South Wales)
Bjoern Menze (TU Munich)
David Moher (Ottawa Hospital Research Institute and University of Ottawa)
Karel G.M. Moons (UMC Utrecht, University Utrecht)
Henning Mueller (HES-SO)
Brennan Nichyporuk (Mila)
Felix Nickel (University Hospital of Heidelberg)
Jens Petersen (German Cancer Research Center (DKFZ))
Nasir Rajpoot (University of Warwick)
Nicola Rieke (Nvidia)
Julio Saez-Rodriguez (Heidelberg University)
Clarisa Sanchez (University of Amsterdam)
Shravya Shetty (Google, LLC)
Maarten van Smeden (University Medical Center Utrecht)
Carole Sudre (King's College London)
Ronald Summers (NIH)
Abdel Aziz Taha (Data Science Studio, Research Studios Austria FG, Vienna, Austria)
Sotirios Tsaftaris (University of Edinburgh)
Ben Van Calster (Katholieke Universiteit (KU) Leuven)
Gaël Varoquaux (INRIA)
Paul Jäger (DKFZ)

More from the Same Authors