

Poster

From Causal to Concept-Based Representation Learning

Goutham Rajendran · Simon Buchholz · Bryon Aragam · Bernhard Schölkopf · Pradeep Ravikumar

Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

To build intelligent machine learning systems, modern representation learning attempts to recover latent generative factors from data, as in causal representation learning. A key question in this growing field is to provide rigorous conditions under which latent factors can be identified and thus potentially learned. Motivated by extensive empirical literature on linear representations and concept learning, we propose to relax causal notions with a geometric notion of concepts. We formally define concepts and prove that they can be recovered from diverse data. Instead of imposing assumptions on the "true" generative latent space, we assume that concepts can be represented linearly in this latent space. The tradeoff is that instead of identifying the "true" generative factors, we identify a subset of desired human-interpretable concepts that are relevant for a given application. Experiments on synthetic data, multimodal CLIP models, and large language models supplement our results and show the utility of our approach. In this way, we provide a foundation for moving from causal representations to interpretable, concept-based representations by bringing together ideas from these two neighboring disciplines.
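As a minimal sketch of the linear-representation assumption mentioned above (the notation here is ours, not taken from the abstract): a concept can be modeled as an affine function of the latent vector, determined by a single direction, so that recovering the concept amounts to identifying that direction rather than the full set of generative factors.

\[
  c(z) \;=\; \langle a_c, z \rangle + b_c, \qquad a_c \in \mathbb{R}^d,\ z \in \mathbb{R}^d,
\]

where z is the latent representation, a_c is the (assumed) linear direction encoding the concept, and b_c is an offset; identification then means recovering a_c up to scaling.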
