We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. We obtain an efficient learning algorithm for a family of Bayesian networks that we call quartet-learnable, meaning that every latent variable has four children that do not have any other parents in common. We show that the existence of such a quartet allows us to uniquely identify each latent variable and to learn all parameters involving that latent variable. Underlying our algorithm are two new techniques for structure learning: a quartet test to determine whether a set of binary variables is singly coupled, and a conditional mutual information test that we use to learn parameters. We also show how to subtract already learned latent variables from the model to create new singly-coupled quartets, which substantially expands the class of structures that we can learn. Finally, we give a proof of the polynomial sample complexity of our learning algorithm, and experimentally compare it to variational EM.
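To make the model class concrete, the following is a minimal sketch of the generative side of a bipartite noisy-or network. It is not the learning algorithm described in the abstract. It assumes the common parameterization by failure probabilities and a per-child leak term, and the names `noisy_or_prob`, `sample`, `prior`, `failure`, and `leak` are hypothetical, chosen only for illustration.

```python
import numpy as np

def noisy_or_prob(z, failure, leak):
    """Return P(x_j = 1 | z) for every observed child j.

    z       : (K,) binary states of the hidden parents
    failure : (K, M) failure probabilities in (0, 1]; 1.0 means "no edge"
    leak    : (M,) probability that a child turns on with no active parent
    (All parameter names are assumptions made for this sketch.)
    """
    # A child stays off only if the leak and every active parent fail to fire.
    p_off = (1.0 - leak) * np.prod(failure ** z[:, None], axis=0)
    return 1.0 - p_off

def sample(n, prior, failure, leak, rng):
    """Draw n joint samples (z, x); only x would be visible to a learner."""
    k, m = failure.shape
    z = (rng.random((n, k)) < prior).astype(int)
    # Same formula as noisy_or_prob, vectorised over samples in log space.
    log_off = np.log1p(-leak) + z @ np.log(failure)
    x = (rng.random((n, m)) < 1.0 - np.exp(log_off)).astype(int)
    return z, x

# One hidden variable with four children: the simplest quartet-style structure.
prior = np.array([0.3])
failure = np.array([[0.2, 0.3, 0.4, 0.5]])
leak = np.full(4, 0.01)

rng = np.random.default_rng(0)
z, x = sample(10_000, prior, failure, leak, rng)
print(x.mean(axis=0))  # empirical marginals of the four observed children
```

In this toy structure the four children share exactly one parent, which is the situation the abstract calls singly coupled; a learner would see only the columns of `x`.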
Author Information
Yacine Jernite (Hugging Face)
Yoni Halpern (Google)
David Sontag (MIT)
More from the Same Authors
2022 : PEST: Combining Parameter-Efficient Fine-Tuning with Self-Training and Co-Training »
Hunter Lang · Monica Agrawal · Yoon Kim · David Sontag -
2022 : BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model »
Christopher Akiki · Giada Pistilli · Margot Mieskes · Matthias Gallé · Thomas Wolf · Suzana Ilic · Yacine Jernite -
2022 : Towards Openness Beyond Open Access: User Journeys through 3 Open AI Collaboratives »
Jennifer Ding · Christopher Akiki · Yacine Jernite · Anne Steele · Temi Popo -
2022 Poster: Falsification before Extrapolation in Causal Effect Estimation »
Zeshan M Hussain · Michael Oberst · Ming-Chieh Shih · David Sontag -
2022 Poster: Evaluating Robustness to Dataset Shift via Parametric Robustness Sets »
Nikolaj Thams · Michael Oberst · David Sontag -
2022 Poster: Training Subset Selection for Weak Supervision »
Hunter Lang · Aravindan Vijayaraghavan · David Sontag -
2022 Poster: The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset »
Hugo Laurençon · Lucile Saulnier · Thomas Wang · Christopher Akiki · Albert Villanova del Moral · Teven Le Scao · Leandro Von Werra · Chenghao Mou · Eduardo González Ponferrada · Huu Nguyen · Jörg Frohberg · Mario Šaško · Quentin Lhoest · Angelina McMillan-Major · Gerard Dupont · Stella Biderman · Anna Rogers · Loubna Ben allal · Francesco De Toni · Giada Pistilli · Olivier Nguyen · Somaieh Nikpoor · Maraim Masoud · Pierre Colombo · Javier de la Rosa · Paulo Villegas · Tristan Thrush · Shayne Longpre · Sebastian Nagel · Leon Weber · Manuel Muñoz · Jian Zhu · Daniel Van Strien · Zaid Alyafeai · Khalid Almubarak · Minh Chien Vu · Itziar Gonzalez-Dios · Aitor Soroa · Kyle Lo · Manan Dey · Pedro Ortiz Suarez · Aaron Gokaslan · Shamik Bose · David Adelani · Long Phan · Hieu Tran · Ian Yu · Suhas Pai · Jenny Chim · Violette Lepercq · Suzana Ilic · Margaret Mitchell · Sasha Alexandra Luccioni · Yacine Jernite -
2022 Poster: ETAB: A Benchmark Suite for Visual Representation Learning in Echocardiography »
Ahmed M. Alaa · Anthony Philippakis · David Sontag -
2021 : Training Transformers Together »
Alexander Borzunov · Max Ryabinin · Tim Dettmers · Quentin Lhoest · Lucile Saulnier · Michael Diskin · Yacine Jernite · Thomas Wolf -
2021 Poster: Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance »
Justin Lim · Christina Ji · Michael Oberst · Saul Blecker · Leora Horwitz · David Sontag -
2018 : TBC 13 »
David Sontag -
2018 Poster: Why Is My Classifier Discriminatory? »
Irene Chen · Fredrik Johansson · David Sontag -
2018 Spotlight: Why Is My Classifier Discriminatory? »
Irene Chen · Fredrik Johansson · David Sontag -
2017 : Invited Talk 4 »
David Sontag -
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris Mooij · David Sontag · Richard Zemel · Max Welling -
2015 Workshop: Machine Learning For Healthcare (MLHC) »
Theofanis Karaletsos · Rajesh Ranganath · Suchi Saria · David Sontag -
2015 Poster: Barrier Frank-Wolfe for Marginal Inference »
Rahul G Krishnan · Simon Lacoste-Julien · David Sontag -
2011 Poster: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy -
2011 Spotlight: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy -
2010 Spotlight: More data means less inference: A pseudo-max approach to structured learning »
David Sontag · Ofer Meshi · Tommi Jaakkola · Amir Globerson -
2010 Poster: More data means less inference: A pseudo-max approach to structured learning »
David Sontag · Ofer Meshi · Tommi Jaakkola · Amir Globerson -
2009 Workshop: Approximate Learning of Large Scale Graphical Models »
Russ Salakhutdinov · Amir Globerson · David Sontag -
2008 Workshop: Approximate inference - how far have we come? »
Amir Globerson · David Sontag · Tommi Jaakkola -
2008 Poster: Clusters and Coarse Partitions in LP Relaxations »
David Sontag · Amir Globerson · Tommi Jaakkola -
2008 Spotlight: Clusters and Coarse Partitions in LP Relaxations »
David Sontag · Amir Globerson · Tommi Jaakkola -
2007 Oral: New Outer Bounds on the Marginal Polytope »
David Sontag · Tommi Jaakkola -
2007 Poster: New Outer Bounds on the Marginal Polytope »
David Sontag · Tommi Jaakkola