Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off
Mateo Espinosa Zarlenga · Pietro Barbiero · Gabriele Ciravegna · Giuseppe Marra · Francesco Giannini · Michelangelo Diligenti · Zohreh Shams · Frederic Precioso · Stefano Melacci · Adrian Weller · Pietro Lió · Mateja Jamnik

Thu Dec 01 02:00 PM -- 04:00 PM (PST) @ Hall J #226

Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts---particularly in real-world conditions where complete and accurate concept supervision is scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations. Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy w.r.t. standard neural models without concepts, (2) provide concept representations capturing meaningful semantics including and beyond their ground truth labels, (3) support test-time concept interventions whose effect on test accuracy surpasses that in standard concept bottleneck models, and (4) scale to real-world conditions where complete concept supervision is scarce.
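The mechanism the abstract describes, an intermediate concept layer whose predictions a human can override at test time, can be sketched in a few lines. The snippet below is a minimal illustrative sketch, not the authors' implementation: each concept gets a pair of "active"/"inactive" embeddings mixed by a predicted concept probability, and an intervention simply replaces that probability with a ground-truth value. All dimensions and parameters are hypothetical, randomly initialised stand-ins for learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions (not taken from the paper's experiments).
n_concepts, emb_dim, n_features = 3, 4, 8

# Stand-ins for learned parameters: per-concept "active"/"inactive"
# embedding generators and a per-concept scoring vector.
W_pos = rng.normal(size=(n_concepts, n_features, emb_dim))
W_neg = rng.normal(size=(n_concepts, n_features, emb_dim))
v = rng.normal(size=(n_concepts, 2 * emb_dim))

def concept_embeddings(x, interventions=None):
    """Return mixed concept embeddings and concept probabilities for input x.

    interventions: optional dict {concept_index: 0 or 1} that overrides the
    predicted concept probability with a ground-truth value, mimicking a
    test-time human intervention on that concept.
    """
    embs, probs = [], []
    for i in range(n_concepts):
        c_pos = np.tanh(x @ W_pos[i])   # embedding if concept i is active
        c_neg = np.tanh(x @ W_neg[i])   # embedding if concept i is inactive
        p = sigmoid(v[i] @ np.concatenate([c_pos, c_neg]))
        if interventions and i in interventions:
            p = float(interventions[i])  # human correction overrides p
        embs.append(p * c_pos + (1.0 - p) * c_neg)
        probs.append(p)
    # A downstream task head would consume the concatenated mixture.
    return np.concatenate(embs), np.array(probs)

x = rng.normal(size=n_features)
emb, probs = concept_embeddings(x)
emb_fixed, probs_fixed = concept_embeddings(x, interventions={0: 1})
```

Because the task head sees only the mixed embeddings, correcting a concept probability directly changes the downstream input, which is how an intervention can improve the final prediction.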

Author Information

Mateo Espinosa Zarlenga (University of Cambridge)
Pietro Barbiero (University of Cambridge)
Gabriele Ciravegna (Laboratoire I3S, Université Côte d'Azur)

I am a postdoc in the MAASAI (Models and Algorithms for Artificial Intelligence) research team of Inria. I received my Ph.D. with honours from the University of Florence in 2022 under the supervision of Professor Marco Gori. In 2018, I received a master's degree in Computer Engineering with honours from the Politecnico di Torino. Besides machine learning, I also enjoy football, volleyball, and playing the piano.

Giuseppe Marra (KU Leuven)
Francesco Giannini (CINI - University of Siena)
Michelangelo Diligenti (Department of Information Engineering and Mathematical Sciences)
Zohreh Shams (Babylon Health, University of Cambridge)
Frederic Precioso (Universite Cote d'Azur)
Stefano Melacci (University of Siena)
Adrian Weller (University of Cambridge, Alan Turing Institute)

Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, where he is also a Turing Fellow leading work on safe and ethical AI. He is a Principal Research Fellow in Machine Learning at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. He serves on several boards including the Centre for Data Ethics and Innovation. Previously, Adrian held senior roles in finance.

Pietro Lió (University of Cambridge)
Mateja Jamnik (University of Cambridge)