Timezone: »
Learning causal structure is useful in many areas of artificial intelligence, such as planning, robotics, and explanation. Constraint-based and hybrid structure learning algorithms such as PC use conditional independence (CI) tests to learn a causal structure. Traditionally, constraint-based algorithms perform the CI tests with a preference for smaller-sized conditioning sets, partially because the statistical power of conventional CI tests declines substantially as the size of the conditioning set increases. However, many modern conditional independence tests are \textit{model-based}, and these tests use well-regularized models that can perform well even with very large conditioning sets. This suggests an intriguing new strategy for constraint-based algorithms which may result in a reduction of the total number of CI tests performed: Test variable pairs with \textit{large} conditioning sets \textit{first}, as a pre-processing step that finds some conditional independencies quickly, before moving on to the more conventional strategy of testing with incrementally larger conditioning sets of sizes (beginning with marginal independence tests). We propose such a pre-processing step for the PC algorithm which relies on performing CI tests on a few randomly selected large conditioning sets. We perform an empirical analysis on directed acyclic graphs (DAGs) that correspond to real-world systems and both an empirical and theoretical analysis for Erd\H{o}s-Renyi DAGs. Our results show that the PC algorithm with our pre-processing step performs far fewer CI tests than the original PC algorithm, between 0.5\% and 20\%, of the CI tests that the PC algorithm alone performs. The efficiency gains are particularly significant for the DAGs corresponding to real-world systems.
Author Information
Erica Cai (College of Information and Computer Sciences, University of Massachusetts at Amherst)
Andrew McGregor (University of Massachusetts Amherst)
David Jensen (University of Massachusetts Amherst)
David Jensen is a Professor of Computer Science at the University of Massachusetts Amherst. He directs the Knowledge Discovery Laboratory and currently serves as the Director of the Computational Social Science Institute, an interdisciplinary effort at UMass to study social phenomena using computational tools and concepts. From 1991 to 1995, he served as an analyst with the Office of Technology Assessment, an agency of the United States Congress. His current research focuses on methods for constructing accurate causal models from observational and experimental data. He regularly serves on program committees for several conferences, including the Conference on Neural Information Processing Systems, the International Conference on Machine Learning, and the Conference on Uncertainty in Artificial Intelligence. He has served on the Board of Directors of the ACM Special Interest Group on Knowledge Discovery and Data Mining (2005-2013), the Defense Science Study Group (2006-2007), and DARPA's Information Science and Technology Group (2007-2012). In 2011, he received the Outstanding Teacher Award from the UMass College of Natural Sciences.
More from the Same Authors
-
2022 Poster: Estimation of Entropy in Constant Space with Improved Sample Complexity »
Maryam Aliakbarpour · Andrew McGregor · Jelani Nelson · Erik Waingarten -
2020 : Prof. David Jensen (University of Massachusetts Amherst) - "Empirical Research in Machine Learning: Perspectives and Strategies" »
David Jensen -
2019 Poster: Sample Complexity of Learning Mixture of Sparse Linear Regressions »
Akshay Krishnamurthy · Arya Mazumdar · Andrew McGregor · Soumyabrata Pal -
2019 Poster: The Case for Evaluating Causal Models Using Interventional Measures and Empirical Data »
Amanda Gentzel · Dan Garant · David Jensen -
2018 : Poster Session 1 (note there are numerous missing names here, all papers appear in all poster sessions) »
Akhilesh Gotmare · Kenneth Holstein · Jan Brabec · Michal Uricar · Kaleigh Clary · Cynthia Rudin · Sam Witty · Andrew Ross · Shayne O'Brien · Babak Esmaeili · Jessica Forde · Massimo Caccia · Ali Emami · Scott Jordan · Bronwyn Woods · D. Sculley · Rebekah Overdorf · Nicolas Le Roux · Peter Henderson · Brandon Yang · Tzu-Yu Liu · David Jensen · Niccolo Dalmasso · Weitang Liu · Paul Marc TRICHELAIR · Jun Ki Lee · Akanksha Atrey · Matt Groh · Yotam Hechtlinger · Emma Tosch -
2018 Poster: Compact Representation of Uncertainty in Clustering »
Craig Greenberg · Nicholas Monath · Ari Kobren · Patrick Flaherty · Andrew McGregor · Andrew McCallum