

Poster in Workshop: Symmetry and Geometry in Neural Representations (NeurReps)

Object-centric causal representation learning

Amin Mansouri · Jason Hartford · Kartik Ahuja · Yoshua Bengio

Keywords: [ Disentanglement ] [ object centric learning ] [ Representation Learning ]


Abstract: There has been significant recent progress in causal representation learning, which has shown a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches are the assumptions that (1) the latent variables are $d$-dimensional vectors, and (2) the observations are the output of some injective observation function of these latent variables. While these assumptions appear benign (they amount to assuming that any changes in the latent space are reflected in the observation space, and that we can use standard encoders to infer the latent variables), we show that when the observations are of multiple objects, the observation function is no longer injective, and disentanglement fails in practice. We can address this failure by combining recent developments in object-centric learning and causal representation learning. By modifying the Slot Attention architecture (Locatello et al., 2020), we develop an object-centric architecture that leverages weak supervision from sparse perturbations to disentangle each object's properties. We argue that this approach is more data-efficient, in the sense that it requires significantly fewer perturbations than a comparable approach that encodes to a Euclidean space, and we show that it successfully disentangles the properties of a set of objects in a series of simple image-based disentanglement experiments.
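
The non-injectivity claim can be illustrated with a toy example. The sketch below is not the paper's setup: `render` and `flatten` are hypothetical helpers, and the 1-D "image" and the (position, intensity) object properties are assumptions chosen for brevity. It shows that swapping the order of two objects changes the flattened latent vector but not the rendered observation, so no encoder can recover the vector ordering from the image alone.

```python
from itertools import permutations


def render(latents, width=8):
    """Toy observation function (assumed): draw each object onto a 1-D canvas.

    Each object is a (position, intensity) pair; the canvas has no notion
    of which object came first.
    """
    canvas = [0.0] * width
    for position, intensity in latents:
        canvas[position] += intensity
    return canvas


def flatten(latents):
    """Stack per-object properties into one d-dimensional latent vector."""
    return [value for obj in latents for value in obj]


# Two objects, each described by (position, intensity).
z = [(1, 0.5), (6, 1.0)]
z_swapped = [z[1], z[0]]

# The two latent vectors differ, yet they produce the same observation.
same_vector = flatten(z) == flatten(z_swapped)
same_image = render(z) == render(z_swapped)

# Every ordering of the objects collapses to a single observation.
distinct_images = {tuple(render(list(p))) for p in permutations(z)}
distinct_vectors = {tuple(flatten(list(p))) for p in permutations(z)}
```

Here the two orderings give two distinct latent vectors but only one image, which is the sense in which a vector-valued latent space makes the observation function non-injective for sets of objects. Treating the latent as a set of per-object slots, as object-centric architectures do, avoids committing to an arbitrary ordering.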
