
Improving the Efficiency of the PC Algorithm by Using Model-Based Conditional Independence Tests
Erica Cai · Andrew McGregor · David Jensen
Event URL: https://openreview.net/forum?id=RaUtO0rA-BI

Learning causal structure is useful in many areas of artificial intelligence, such as planning, robotics, and explanation. Constraint-based and hybrid structure learning algorithms such as PC use conditional independence (CI) tests to learn a causal structure. Traditionally, constraint-based algorithms perform CI tests with a preference for smaller conditioning sets, partly because the statistical power of conventional CI tests declines substantially as the size of the conditioning set increases. However, many modern CI tests are model-based, and these tests use well-regularized models that can perform well even with very large conditioning sets. This suggests an intriguing new strategy for constraint-based algorithms that may reduce the total number of CI tests performed: test variable pairs with large conditioning sets first, as a pre-processing step that finds some conditional independencies quickly, before moving on to the conventional strategy of testing with incrementally larger conditioning sets (beginning with marginal independence tests). We propose such a pre-processing step for the PC algorithm that relies on performing CI tests on a few randomly selected large conditioning sets. We perform an empirical analysis on directed acyclic graphs (DAGs) that correspond to real-world systems, and both an empirical and a theoretical analysis for Erdős–Rényi DAGs. Our results show that the PC algorithm with our pre-processing step performs far fewer CI tests than the original PC algorithm: between 0.5% and 20% of the CI tests that the PC algorithm alone performs. The efficiency gains are particularly significant for the DAGs corresponding to real-world systems.
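The pre-processing idea described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the `ci_test` callable, the `num_sets` parameter, and the choice of conditioning-set size (roughly half the remaining variables) are all assumptions made here for concreteness.

```python
import itertools
import random

def preprocess_large_sets(variables, ci_test, num_sets=3, rng=None):
    """Hedged sketch of the pre-processing step: before the standard PC
    adjacency search, test each variable pair against a few randomly
    selected *large* conditioning sets. Pairs found conditionally
    independent are dropped from the candidate edge set, so PC starts
    from a sparser skeleton and performs fewer CI tests overall.

    `ci_test(x, y, z)` is a hypothetical model-based CI test that
    returns True if x is independent of y given the list z.
    """
    rng = rng or random.Random(0)
    edges = set(itertools.combinations(variables, 2))
    removed = set()
    for (x, y) in list(edges):
        others = [v for v in variables if v not in (x, y)]
        for _ in range(num_sets):
            # Randomly pick a large conditioning set; here we use about
            # half of the remaining variables (an assumption for this
            # sketch -- the paper tests a few such randomly chosen sets).
            k = max(1, len(others) // 2)
            z = rng.sample(others, k)
            if ci_test(x, y, z):
                removed.add((x, y))
                break  # one detected independency suffices to drop the pair
    return edges - removed
```

The surviving edge set would then be handed to the usual PC skeleton search, which proceeds with conditioning sets of increasing size starting from marginal tests.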

Author Information

Erica Cai (College of Information and Computer Sciences, University of Massachusetts at Amherst)
Andrew McGregor (University of Massachusetts Amherst)
David Jensen (University of Massachusetts Amherst)

David Jensen is a Professor of Computer Science at the University of Massachusetts Amherst. He directs the Knowledge Discovery Laboratory and currently serves as the Director of the Computational Social Science Institute, an interdisciplinary effort at UMass to study social phenomena using computational tools and concepts. From 1991 to 1995, he served as an analyst with the Office of Technology Assessment, an agency of the United States Congress. His current research focuses on methods for constructing accurate causal models from observational and experimental data. He regularly serves on program committees for several conferences, including the Conference on Neural Information Processing Systems, the International Conference on Machine Learning, and the Conference on Uncertainty in Artificial Intelligence. He has served on the Board of Directors of the ACM Special Interest Group on Knowledge Discovery and Data Mining (2005-2013), the Defense Science Study Group (2006-2007), and DARPA's Information Science and Technology Group (2007-2012). In 2011, he received the Outstanding Teacher Award from the UMass College of Natural Sciences.
