To perform counterfactual reasoning in Structural Causal Models (SCMs), one needs to know the causal mechanisms, which factorize conditional distributions into noise sources and deterministic functions mapping realizations of noise to samples. Unfortunately, the causal mechanism is not uniquely identified by data that can be gathered by observing and interacting with the world, so the question remains of how to choose causal mechanisms. In recent work, Oberst & Sontag (2019) propose Gumbel-max SCMs, which use Gumbel-max reparameterizations as the causal mechanism because they satisfy an appealing counterfactual stability property; however, this justification ultimately rests on intuition. In this work, we instead argue for choosing the causal mechanism that is best under a quantitative criterion, such as minimizing variance when estimating counterfactual treatment effects. We propose a parameterized family of causal mechanisms that generalize Gumbel-max. We show that they can be trained to minimize counterfactual effect variance and other losses on a distribution of queries of interest, yielding lower-variance estimates of counterfactual treatment effects than fixed alternatives, while also generalizing to queries not seen at training time.
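For intuition, the following is a minimal Python/NumPy sketch of the fixed Gumbel-max mechanism that the paper generalizes: a categorical variable is sampled as an argmax over log-probabilities plus Gumbel noise, and the same Gumbel draws are reused when the mechanism is re-evaluated under an intervention, which is what couples the factual and counterfactual samples. The distributions below are illustrative assumptions, and the learned, generalized mechanisms proposed in the paper are not shown.

import numpy as np

def gumbel_max_sample(log_probs, gumbels):
    # Gumbel-max trick: argmax over log-probabilities plus Gumbel noise
    # is an exact sample from the corresponding categorical distribution.
    return int(np.argmax(log_probs + gumbels))

rng = np.random.default_rng(0)

# Illustrative categorical distributions over 4 outcomes: the factual
# mechanism and the mechanism after a hypothetical intervention.
factual_log_probs = np.log([0.5, 0.2, 0.2, 0.1])
counterfactual_log_probs = np.log([0.1, 0.2, 0.2, 0.5])

# Shared exogenous noise: one Gumbel per outcome, drawn once and reused
# for both the factual and the intervened mechanism.
gumbels = rng.gumbel(size=4)

y_factual = gumbel_max_sample(factual_log_probs, gumbels)
y_counterfactual = gumbel_max_sample(counterfactual_log_probs, gumbels)
print(y_factual, y_counterfactual)

Averaging paired differences of such shared-noise samples over many draws gives a counterfactual treatment-effect estimate whose variance depends on the chosen mechanism; minimizing that variance is the criterion the abstract refers to.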
Author Information
Guy Lorberbom (Technion)
Daniel D. Johnson (Google Research, Brain Team)
Chris Maddison (University of Toronto)
Daniel Tarlow (Google Research, Brain Team)
Tamir Hazan (Technion)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Learning Generalized Gumbel-max Causal Mechanisms »
  Tue. Dec 7th 04:30 -- 06:00 PM
More from the Same Authors
- 2021 Spotlight: Lossy Compression for Lossless Prediction »
  Yann Dubois · Benjamin Bloem-Reddy · Karen Ullrich · Chris Maddison
- 2021 Spotlight: PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair »
  Zimin Chen · Vincent J Hellendoorn · Pascal Lamblin · Petros Maniatis · Pierre-Antoine Manzagol · Daniel Tarlow · Subhodeep Moitra
- 2021: Optimal Representations for Covariate Shifts »
  Yann Dubois · Yangjun Ruan · Chris Maddison
- 2022 Poster: On the Importance of Gradient Norm in PAC-Bayesian Bounds »
  Itai Gat · Yossi Adi · Alex Schwing · Tamir Hazan
- 2021 Workshop: Advances in Programming Languages and Neurosymbolic Systems (AIPLANS) »
  Breandan Considine · Disha Shrivastava · David Yu-Tung Hui · Chin-Wei Huang · Shawn Tan · Xujie Si · Prakash Panangaden · Guy Van den Broeck · Daniel Tarlow
- 2021: Invited Talk 6 »
  Chris Maddison
- 2021 Poster: Structured Denoising Diffusion Models in Discrete State-Spaces »
  Jacob Austin · Daniel D. Johnson · Jonathan Ho · Daniel Tarlow · Rianne van den Berg
- 2021 Poster: Learning to Combine Per-Example Solutions for Neural Program Synthesis »
  Disha Shrivastava · Hugo Larochelle · Daniel Tarlow
- 2021 Poster: Lossy Compression for Lossless Prediction »
  Yann Dubois · Benjamin Bloem-Reddy · Karen Ullrich · Chris Maddison
- 2021 Poster: PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair »
  Zimin Chen · Vincent J Hellendoorn · Pascal Lamblin · Petros Maniatis · Pierre-Antoine Manzagol · Daniel Tarlow · Subhodeep Moitra
- 2020 Poster: Learning Graph Structure With A Finite-State Automaton Layer »
  Daniel D. Johnson · Hugo Larochelle · Danny Tarlow
- 2020 Spotlight: Learning Graph Structure With A Finite-State Automaton Layer »
  Daniel D. Johnson · Hugo Larochelle · Danny Tarlow
- 2020 Poster: Gradient Estimation with Stochastic Softmax Tricks »
  Max Paulus · Dami Choi · Danny Tarlow · Andreas Krause · Chris Maddison
- 2020 Poster: Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies »
  Itai Gat · Idan Schwartz · Alex Schwing · Tamir Hazan
- 2020 Oral: Gradient Estimation with Stochastic Softmax Tricks »
  Max Paulus · Dami Choi · Danny Tarlow · Andreas Krause · Chris Maddison
- 2020 Poster: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces »
  Guy Lorberbom · Chris Maddison · Nicolas Heess · Tamir Hazan · Danny Tarlow
- 2019 Poster: Hamiltonian descent for composite objectives »
  Brendan O'Donoghue · Chris Maddison
- 2019 Poster: Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder »
  Guy Lorberbom · Andreea Gane · Tommi Jaakkola · Tamir Hazan
- 2019 Poster: Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders »
  Emile Mathieu · Charline Le Lan · Chris Maddison · Ryota Tomioka · Yee Whye Teh
- 2018 Poster: Latent Gaussian Activity Propagation: Using Smoothness and Structure to Separate and Localize Sounds in Large Noisy Environments »
  Daniel D. Johnson · Daniel Gorelik · Ross E Mawhorter · Kyle Suver · Weiqing Gu · Steven Xing · Cody Gabriel · Peter Sankhagowit
- 2017 Poster: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models »
  George Tucker · Andriy Mnih · Chris J Maddison · John Lawson · Jascha Sohl-Dickstein
- 2017 Oral: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models »
  George Tucker · Andriy Mnih · Chris J Maddison · John Lawson · Jascha Sohl-Dickstein
- 2017 Poster: Filtering Variational Objectives »
  Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh
- 2017 Poster: High-Order Attention Models for Visual Question Answering »
  Idan Schwartz · Alex Schwing · Tamir Hazan
- 2016 Poster: Constraints Based Convex Belief Propagation »
  Yaniv Tenzer · Alex Schwing · Kevin Gimpel · Tamir Hazan
- 2014 Workshop: Perturbations, Optimization, and Statistics »
  Tamir Hazan · George Papandreou · Danny Tarlow
- 2014 Poster: A* Sampling »
  Chris Maddison · Danny Tarlow · Tom Minka
- 2014 Oral: A* Sampling »
  Chris Maddison · Danny Tarlow · Tom Minka
- 2013 Workshop: Perturbations, Optimization, and Statistics »
  Tamir Hazan · George Papandreou · Sasha Rakhlin · Danny Tarlow
- 2013 Poster: Learning Efficient Random Maximum A-Posteriori Predictors with Non-Decomposable Loss Functions »
  Tamir Hazan · Subhransu Maji · Joseph Keshet · Tommi Jaakkola
- 2013 Poster: Annealing between distributions by averaging moments »
  Roger Grosse · Chris Maddison · Russ Salakhutdinov
- 2013 Oral: Annealing between distributions by averaging moments »
  Roger Grosse · Chris Maddison · Russ Salakhutdinov
- 2013 Poster: On Sampling from the Gibbs Distribution with Random Maximum A-Posteriori Perturbations »
  Tamir Hazan · Subhransu Maji · Tommi Jaakkola
- 2012 Workshop: Perturbations, Optimization, and Statistics »
  Tamir Hazan · George Papandreou · Danny Tarlow
- 2012 Poster: Globally Convergent Dual MAP LP Relaxation Solvers using Fenchel-Young Margins »
  Alex Schwing · Tamir Hazan · Marc Pollefeys · Raquel Urtasun
- 2010 Poster: A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction »
  Tamir Hazan · Raquel Urtasun
- 2010 Poster: Direct Loss Minimization for Structured Prediction »
  David A McAllester · Tamir Hazan · Joseph Keshet