
Workshop on neuro Causal and Symbolic AI (nCSI)
Matej Zečević · Devendra Dhami · Christina Winkler · Thomas Kipf · Robert Peharz · Petar Veličković

Fri Dec 09 04:00 AM -- 01:00 PM (PST) @ Virtual

Understanding causal interactions is central to human cognition and thereby a central quest in science, engineering, business, and law. Developmental psychology has shown that children explore the world much as scientists do, asking questions such as “What if?” and “Why?” AI research aims to replicate these capabilities in machines. Deep learning in particular has brought about powerful tools for function approximation by means of end-to-end trainable deep neural networks, a capability corroborated by tremendous success in countless applications. However, the lack of interpretability and reasoning capabilities of these networks proves to be a hindrance to building systems of human-like ability. Enabling causal reasoning capabilities in deep learning is therefore of critical importance for research on the path towards human-level intelligence. First steps towards neural-causal models exist and promise a vision of AI systems that perform causal inferences as efficiently as modern-day neural models. Similarly, classical symbolic methods are being revisited and reintegrated into current systems to allow for reasoning capabilities beyond pure pattern recognition. The Pearlian formalization of causality has revealed a theoretically sound and practically strict hierarchy of reasoning that serves as a helpful benchmark for evaluating the reasoning capabilities of neuro-symbolic systems.

Our aim is to bring together researchers interested in the integration of research areas in artificial intelligence (general machine and deep learning, symbolic and object-centric methods, and logic) with rigorous formalizations of causality with the goal of developing next-generation AI systems.
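To make the distinction between the rungs of the Pearlian hierarchy concrete, here is a minimal sketch (not from the workshop materials; the structural equations and coefficients are illustrative assumptions) showing that an observational query P(Y | X = x) and an interventional query P(Y | do(X = x)) can disagree under confounding:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Toy linear SCM with a confounder Z -> X and Z -> Y:
#   Z := N(0, 1)
#   X := Z + N(0, 1)
#   Y := 2*X + 3*Z + N(0, 1)
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = 2 * x + 3 * z + rng.normal(size=n)

# Rung 1 (association): regress Y on observed X.
# The confounder Z inflates the slope above the causal coefficient 2.
obs_slope = np.cov(x, y)[0, 1] / np.var(x)

# Rung 2 (intervention): simulate do(X = x) by severing Z -> X,
# i.e. set X exogenously and re-evaluate Y's structural equation.
x_do = rng.normal(size=n)  # X no longer depends on Z
y_do = 2 * x_do + 3 * z + rng.normal(size=n)
do_slope = np.cov(x_do, y_do)[0, 1] / np.var(x_do)

print(f"observational slope ~ {obs_slope:.2f}")  # close to 3.5, biased by Z
print(f"interventional slope ~ {do_slope:.2f}")  # close to 2.0, the causal effect
```

A purely observational model can only answer the first query; answering the second requires knowing (or identifying) the causal structure, which is exactly the gap the neural-causal methods discussed at this workshop aim to close.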

Fri 4:00 a.m. - 4:10 a.m.  Welcome & Opening Remarks (Opening). Matej Zečević

Fri 4:10 a.m. - 4:40 a.m.  Opening Keynote for nCSI (Invited Speaker). Judea Pearl

Fri 4:40 a.m. - 4:50 a.m.  GlanceNets: Interpretable, Leak-proof Concept-based Models (Oral). Emanuele Marconato · Andrea Passerini · Stefano Teso
Abstract: There is growing interest in concept-based models (CBMs) that combine high performance and interpretability by acquiring and reasoning with a vocabulary of high-level concepts. A key requirement is that the concepts be interpretable. Existing CBMs tackle this desideratum using a variety of heuristics based on unclear notions of interpretability, and fail to acquire concepts with the intended semantics. We address this by providing a clear definition of interpretability in terms of alignment between the model’s representation and an underlying data generation process, and introduce GlanceNets, a new CBM that exploits techniques from causal disentangled representation learning and open-set recognition to achieve alignment, thus improving the interpretability of the learned concepts. We show that GlanceNets, paired with concept-level supervision, achieve better alignment than state-of-the-art approaches while preventing spurious information from unintentionally leaking into the learned concepts.

Fri 4:50 a.m. - 5:00 a.m.  Meaning without reference in large language models (Oral). Steven Piantadosi · Felix Hill
Abstract: The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings. Contrary to claims that LLMs possess no meaning whatsoever, we argue that they likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role. Because conceptual role is defined by the relationships between internal representational states, meaning cannot be determined from a model's architecture, training data, or objective function, but only by examination of how its internal states relate to each other. This approach may clarify why and how LLMs are so successful and suggest how they can be made more human-like.

Fri 5:00 a.m. - 6:30 a.m.  Virtual Poster Session 1 (Poster Session)

Fri 6:30 a.m. - 7:00 a.m.  Neural Models with Symbolic Representations for Perceptuo-Reasoning Tasks (Invited Speaker). Mausam

Fri 7:00 a.m. - 7:30 a.m.  Causal Inference from Text (Invited Speaker). Dhanya Sridhar

Fri 7:30 a.m. - 8:30 a.m.  Break

Fri 8:30 a.m. - 8:40 a.m.  Unlocking Slot Attention by Changing Optimal Transport Costs (Oral). Yan Zhang · David Zhang · Simon Lacoste-Julien · Gertjan Burghouts · Cees Snoek
Abstract: Slot attention is a successful method for object-centric modeling with images and videos for tasks like unsupervised object discovery. However, set-equivariance limits its ability to perform tiebreaking, which makes distinguishing similar structures difficult – a task crucial for vision problems. To fix this, we cast cross-attention in slot attention as an optimal transport (OT) problem that has solutions with the desired tiebreaking properties. We then propose an entropy minimization module that combines the tiebreaking properties of unregularized OT with the speed of regularized OT. We evaluate our method on CLEVR object detection and observe significant improvements from 53% to 91% on a strict average precision metric.

Fri 8:40 a.m. - 8:50 a.m.  Interventional Causal Representation Learning (Oral). Kartik Ahuja · Yixin Wang · Divyat Mahajan · Yoshua Bengio
Abstract: The theory of identifiable representation learning aims to build general-purpose methods that extract high-level latent (causal) factors from low-level sensory data. Most existing works focus on identifiable representation learning with observational data, relying on distributional assumptions on latent (causal) factors. However, in practice, we often also have access to interventional data for representation learning, e.g. from robotic manipulation experiments in robotics, from genetic perturbation experiments in genomics, or from electrical stimulation experiments in neuroscience. How can we leverage interventional data to help identify high-level latents? To this end, we explore the role of interventional data for identifiable representation learning in this work. We study the identifiability of latent causal factors with and without interventional data, under minimal distributional assumptions on latents. We prove that, if the true latent maps to the observed high-dimensional data via a polynomial function, then representation learning via minimizing the standard reconstruction loss (used in autoencoders) can identify the true latents up to affine transformation. If we further have access to interventional data generated by hard do-interventions on some latents, then we can identify these intervened latents up to permutation, shift, and scaling.

Fri 8:50 a.m. - 9:20 a.m.  Representation Learning and Causality (Invited Speaker). Jovana Mitrovic

Fri 9:20 a.m. - 10:30 a.m.  Virtual Poster Session 2 (Poster Session)

Fri 10:30 a.m. - 11:00 a.m.  A Counterfactual Simulation Model of Causal Judgment (Invited Speaker). Tobias Gerstenberg

Fri 11:00 a.m. - 11:30 a.m.  AI can learn from data. But can it learn to reason? (Invited Speaker). Guy Van den Broeck

Fri 11:30 a.m. - 12:00 p.m.  Break

Fri 12:00 p.m. - 12:50 p.m.  Panel Discussion: "Heading for a Unifying View on nCSI" (Panel). Tobias Gerstenberg · Sriraam Natarajan · Mausam · Guy Van den Broeck · Devendra Dhami

Fri 12:50 p.m. - 1:00 p.m.  Closing Remarks (Closing). Matej Zečević

- Synthesized Differentiable Programs (Poster). Lucas Saldyt
Abstract: Program synthesis algorithms produce interpretable and generalizable code that captures input data, but are not directly amenable to continuous optimization using gradient descent. In theory, any program can be represented in a Turing-complete neural network model, which implies that it is possible to compile syntactic programs into the weights of a neural network by using a technique known as neural compilation. This paper presents a combined algorithm for synthesizing syntactic programs, compiling them into the weights of a neural network, and then tuning the resulting model. This paper's experiments establish that program synthesis, neural compilation, and differentiable optimization together form an efficient algorithm for inducing abstract algorithmic structure and a corresponding local set of desirable complex programs.

- Probabilities of Causation: Adequate Size of Experimental and Observational Samples (Poster). Ang Li · Ruirui Mao · Judea Pearl
Abstract: The probabilities of causation are commonly used to solve decision-making problems. Tian and Pearl derived sharp bounds for the probability of necessity and sufficiency (PNS), the probability of sufficiency (PS), and the probability of necessity (PN) using experimental and observational data. The assumption is that one is in possession of a large enough sample to permit an accurate estimation of the experimental and observational distributions. In this study, we present a method for determining the sample size needed for such estimation, when a given confidence interval (CI) is specified. We further show by simulation that the proposed sample size delivers stable estimates of the bounds of PNS.

- Discrete Learning Of DAGs Via Backpropagation (Poster). Andrew Wren · Pasquale Minervini · Luca Franceschi · Valentina Zantedeschi
Abstract: Recently, continuous relaxations have been proposed in order to learn directed acyclic graphs (DAGs) by backpropagation, instead of combinatorial optimization. However, a number of techniques for fully discrete backpropagation could instead be applied. In this paper, we explore this direction and propose DAG-DB, a framework for learning DAGs by Discrete Backpropagation, based on the architecture of Implicit Maximum Likelihood Estimation (I-MLE). DAG-DB performs competitively using either of two fully discrete backpropagation techniques: I-MLE itself, or straight-through estimation.

- Symbolic Causal Inference via Operations on Probabilistic Circuits (Poster). Benjie Wang · Marta Kwiatkowska
Abstract: Causal inference provides a means of translating a target causal query into a causal formula, which is a function of the observational distribution, given some assumptions on the domain. With the advent of modern neural probabilistic models, this opens up the possibility to perform accurate and tractable causal inference on realistic, high-dimensional data distributions, a crucial component of reasoning systems. However, for most model classes, the computation of the causal formula from the observational model is intractable. In this work, we hypothesize that probabilistic circuits, a general and expressive class of tractable probabilistic models, may be more amenable to the computation of causal formulae. Unfortunately, we prove that evaluating even simple causal formulae is still intractable for most types of probabilistic circuits. Motivated by this, we devise a conceptual framework for analyzing the tractability of causal formulae by decomposing them into compositions of primitive operations, in order to identify tractable subclasses of circuits. This allows us to derive, for a specific subclass of circuits, the first tractable algorithms for computing the backdoor and frontdoor adjustment formulae.

- Benchmarking Counterfactual Reasoning Abilities about Implicit Physical Properties (Poster). Maitreya Patel · Tejas Gokhale · Chitta Baral · 'YZ' Yezhou Yang
Abstract: Videos often capture objects, their motion, and the interactions between different objects. Although real-world objects have physical properties associated with them, many of these properties (such as mass and coefficient of friction) are not captured directly by the imaging pipeline. However, these properties can be estimated by utilizing cues from relative object motion and the dynamics introduced by collisions. In this paper, we introduce a new video question answering task for reasoning about the implicit physical properties of objects in a scene, from videos. For this task, we introduce a dataset, CRIPP-VQA (Counterfactual Reasoning about Implicit Physical Properties), which contains videos of objects in motion, annotated with hypothetical/counterfactual questions about the effect of actions (such as removing, adding, or replacing objects), questions about planning (choosing actions to perform in order to reach a particular goal), as well as descriptive questions about the visible properties of objects. We benchmark the performance of existing deep-learning-based video question answering models on CRIPP-VQA. Our experiments reveal a surprising and significant performance gap in terms of answering questions about implicit properties (the focus of this paper) and explicit properties (the focus of prior work) of objects (as shown in Table 1).

- Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement (Poster). Michael Chang · Alyssa L Dayan · Franziska Meier · Tom Griffiths · Sergey Levine · Amy Zhang
Abstract: Object rearrangement is a challenge for embodied agents because solving these tasks requires generalizing across a combinatorially large set of underlying entities that take the value of object states. Worse, these entities are often unknown and must be inferred from sensory percepts. We present a hierarchical abstraction approach to uncover these underlying entities and achieve combinatorial generalization from unstructured inputs. By constructing a factorized transition graph over clusters of object representations inferred from pixels, we show how to learn a correspondence between intervening on states of entities in the agent's model and acting on objects in the environment. We use this correspondence to develop a method for control that generalizes to different numbers and configurations of objects, which outperforms current offline deep RL methods when evaluated on a set of simulated rearrangement and stacking tasks.

- Enhancing Transfer of Reinforcement Learning Agents with Abstract Contextual Embeddings (Poster). Guy Azran · Mohamad Hosein Danesh · Stefano Albrecht · Sarah Keren
Abstract: Deep reinforcement learning (DRL) algorithms have seen great success in performing a plethora of tasks, but often have trouble adapting to changes in the environment. We address this issue by using reward machines (RMs), a graph-based abstraction of the underlying task, to represent the current setting or context. Using a graph neural network (GNN), we embed the RMs into deep latent vector representations and provide them to the agent to enhance its ability to adapt to new contexts. To the best of our knowledge, this is the first work to embed contextual abstractions and let the agent decide how to use them. Our preliminary empirical evaluation demonstrates improved sample efficiency of our approach upon context transfer on a set of grid navigation tasks.

- Playgrounds for Abstraction and Reasoning (Poster). Subin Kim · Prin Phunyaphibarn · Donghyun Ahn · Sundong Kim
Abstract: While research on reasoning using large models is in the spotlight, a symbolic method of making a compact model capable of reasoning is also attracting public attention. We introduce the Mini-ARC dataset, a 5x5 compact version of the Abstraction and Reasoning Corpus (ARC), to measure abductive reasoning capability. The dataset is small but creative: it maintains the difficulty of the original dataset but improves usability for model training. Along with Mini-ARC, we introduce the O2ARC interface, which includes richer features for humans to solve the ARC tasks. By solving Mini-ARC with O2ARC, we collect human trajectories called Mini-ARC traces, which are potentially helpful in developing an AI with reasoning capability.

- Causal Discovery for Modular World Models (Poster). Anson Lei · Bernhard Schölkopf · Ingmar Posner
Abstract: Latent world models allow agents to reason about complex environments with high-dimensional observations. However, adapting to new environments and effectively leveraging previous knowledge remain significant challenges. We present variational causal dynamics (VCD), a structured world model that exploits the invariance of causal mechanisms across environments to achieve fast and modular adaptation. VCD identifies reusable components across different environments by combining causal discovery and variational inference to learn a latent representation and transition model jointly in an unsupervised manner. In evaluations on simulated environments with image observations, we show that VCD is able to successfully identify causal variables. Moreover, given a small number of observations in a previously unseen, intervened environment, VCD is able to identify the sparse changes in the dynamics and to adapt efficiently. In doing so, VCD significantly extends the capabilities of the current state of the art in latent world models.

- Active Bayesian Causal Inference (Poster). Christian Toth · Lars Lorch · Christian Knoll · Andreas Krause · Franz Pernkopf · Robert Peharz · Julius von Kügelgen
Abstract: Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully specified causal model. From a Bayesian perspective, it is natural to treat a causal query (e.g., the causal graph or some causal effect) as subject to posterior inference, while other unobserved quantities ought to be marginalized out. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. ABCI sequentially designs experiments that are maximally informative about the target causal query, collects the corresponding interventional data, and updates the Bayesian beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.

- Learning Neuro-symbolic Programs for Language-Guided Robotic Manipulation (Poster). Namasivayam Kalithasan · Himanshu Singh · Vishal Bindal · Arnav Tuli · Vishwajeet Agrawal · Rahul Jain · Parag Singla · Rohan Paul
Abstract: Given a natural language instruction and an input and an output scene, our goal is to train a neuro-symbolic model which can output a manipulation program that can be executed by the robot on the input scene, resulting in the desired output scene. Prior approaches for this task have one of the following limitations: (i) they rely on hand-coded symbols for concepts, limiting generalization beyond those seen during training (R. Paul et al., 2016); (ii) they infer action sequences from instructions but require dense sub-goal supervision (C. Paxton et al., 2019); or (iii) they lack the semantics required for deeper object-centric reasoning inherent in interpreting complex instructions (M. Shridhar et al., 2022). In contrast, our approach is neuro-symbolic and can handle linguistic as well as perceptual variations, is end-to-end differentiable requiring no intermediate supervision, and makes use of symbolic reasoning constructs which operate on a latent neural object-centric representation, allowing for deeper reasoning over the input scene. Our experiments on a simulated environment with a 7-DOF manipulator, consisting of instructions with a varying number of steps, scenes with different numbers of objects, and objects with unseen attribute combinations, demonstrate that our model is robust to such variations and significantly outperforms existing baselines, particularly in generalization settings.

- Image Manipulation via Neuro-Symbolic Networks (Poster). Harman Singh · Poorva Garg · Mohit Gupta · Kevin Shah · Arnab Kumar Mondal · Dinesh Khandelwal · Parag Singla · Dinesh Garg
Abstract: We are interested in image manipulation via natural language text, a task that is extremely useful for multiple AI applications but requires complex reasoning over multi-modal spaces. Recent work on neuro-symbolic approaches has been quite effective in solving such tasks, as they offer better modularity, interpretability, and generalizability. A noteworthy such approach is NSCL [25], developed for the task of Visual Question Answering (VQA). We extend NSCL for the image manipulation task and propose a solution referred to as NeuroSIM. Unlike previous works, which either require supervised training data or can only deal with very simple reasoning instructions over single-object scenes, NeuroSIM can perform complex multi-hop reasoning over multi-object scenes and requires only weak supervision in the form of annotated data for the VQA task. On the language side, NeuroSIM contains neural modules that parse an instruction into a symbolic program that guides the manipulation. These programs are based on a Domain Specific Language (DSL) comprising object attributes as well as manipulation operations. On the perceptual side, NeuroSIM contains neural modules which first generate a scene graph of the input image and then change the scene graph representation in accordance with the parsed instruction. To train these modules, we design novel loss functions that are capable of testing the correctness of manipulated object and scene graph representations via query networks that are trained merely on the VQA dataset. An image decoder is trained to render the final image from the manipulated scene graph representation. The entire NeuroSIM pipeline is trained without any intermediate supervision. Extensive experiments demonstrate that our approach is highly competitive with state-of-the-art supervised baselines.

- Counterfactual reasoning: Do Language Models need world knowledge for causal inference? (Poster). Jiaxuan Li · Lang Yu · Allyson Ettinger
Abstract: Current pre-trained language models have enabled remarkable improvements in downstream tasks, but it remains difficult to distinguish effects of statistical correlation from more systematic logical reasoning grounded in understanding of the real world. In this paper, we tease these factors apart by leveraging counterfactual conditionals, which force language models to predict unusual consequences based on hypothetical propositions. We introduce a set of tests drawn from psycholinguistic experiments, as well as larger-scale controlled datasets, to probe counterfactual predictions from a variety of popular pre-trained language models. We find that models are consistently able to override real-world knowledge in counterfactual scenarios, and that this effect is more robust in the case of stronger baseline world knowledge; however, we also find that for most models this effect appears largely to be driven by simple lexical cues. When we mitigate effects of both world knowledge and lexical cues to test knowledge of linguistic nuances of counterfactuals, we find that only GPT-3 shows sensitivity to these nuances, though this sensitivity is also non-trivially impacted by lexical associative factors.

- Graphs, Constraints, and Search for the Abstraction and Reasoning Corpus (Poster). Yudong Xu · Elias Khalil · Scott Sanner
Abstract: The Abstraction and Reasoning Corpus (ARC) aims at benchmarking the performance of general artificial intelligence algorithms. The ARC's focus on broad generalization and few-shot learning has made it impossible to solve using pure machine learning. A more promising approach has been to perform program synthesis within an appropriately designed Domain Specific Language (DSL). However, these approaches too have seen limited success. We propose Abstract Reasoning with Graph Abstractions (ARGA), a new object-centric framework that first represents images using graphs and then performs a constraint-guided search for a correct program in a DSL that is based on the abstracted graph space. Early experiments demonstrate the promise of ARGA in tackling some of the complicated tasks of the ARC rather efficiently, producing programs that are correct and easy to understand.

- The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning (Poster). Hanlin Zhang · Yifan Zhang · Li Erran Li · Eric Xing
Abstract: Pre-trained language models (LMs) have shown remarkable reasoning performance using explanations for in-context learning. On the other hand, those reasoning tasks are usually presumed to be more approachable for symbolic programming. To make progress towards understanding in-context learning, we revisit neuro-symbolic approaches and design a model, LMLP, that learns from demonstrations containing logic rules and corresponding examples to iteratively reason over knowledge bases (KBs). Such a procedure makes explicit the correspondence between LMs' outputs and predicates in the KBs to recover Prolog's backward chaining algorithm. Comprehensive experiments are included to systematically compare LMLP with natural language counterparts like "chain-of-thought" (CoT) prompting in deductive and inductive reasoning settings, demonstrating that LMLP enjoys much better efficiency and length generalization in various settings.

- GlanceNets: Interpretable, Leak-proof Concept-based Models (Poster). Emanuele Marconato · Andrea Passerini · Stefano Teso (abstract as in the oral presentation above)

- Interventional Causal Representation Learning (Poster). Kartik Ahuja · Yixin Wang · Divyat Mahajan · Yoshua Bengio (abstract as in the oral presentation above)

- Unlocking Slot Attention by Changing Optimal Transport Costs (Poster). Yan Zhang · David Zhang · Simon Lacoste-Julien · Gertjan Burghouts · Cees Snoek (abstract as in the oral presentation above)
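The slot attention work above casts cross-attention as an entropy-regularized optimal transport problem. As a generic illustration of the underlying machinery (a standard Sinkhorn sketch with assumed uniform marginals, not the authors' entropy-minimization module), the following shows how the transport plan moves from a soft, near-uniform coupling toward a hard, tiebreaking assignment as the regularization ε shrinks:

```python
import numpy as np

def sinkhorn(cost, eps, n_iters=200):
    """Entropy-regularized OT between uniform marginals.

    Returns a transport plan P approximately minimizing
    <P, cost> - eps * H(P) subject to uniform row/column sums.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones(n) / n
    v = np.ones(m) / m
    for _ in range(n_iters):
        u = (1.0 / n) / (K @ v)      # scale rows to sum to 1/n
        v = (1.0 / m) / (K.T @ u)    # scale columns to sum to 1/m
    return u[:, None] * K * v[None, :]

# Two "slots" (rows) and two inputs (columns), each slot preferring
# a different input.
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])

P_soft = sinkhorn(cost, eps=10.0)  # heavy regularization: near-uniform plan
P_hard = sinkhorn(cost, eps=0.05)  # light regularization: near-permutation plan

print(np.round(P_soft, 3))  # mass spread across all pairings
print(np.round(P_hard, 3))  # mass concentrated on the diagonal
```

Unlike plain softmax cross-attention, the coupling constraints force the slots to divide the inputs between them, which is the tiebreaking behavior the abstract refers to.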