
NIPS 2013 Workshop on Causality: Large-scale Experiment Design and Inference of Causal Mechanisms
Isabelle Guyon · Leon Bottou · Bernhard Schölkopf · Alexander Statnikov · Evelyne Viegas · James M. Robins

Mon Dec 09 07:30 AM -- 06:30 PM (PST) @ Harrah's Glenbrook+Emerald
Event URL: http://clopinet.com/isabelle/Projects/NIPS2013/

The goal of this workshop is to discuss new methods of large-scale experiment design and their application to the inference of causal mechanisms, and to promote their evaluation via a series of challenges. Emphasis will be put on capitalizing on massive amounts of available observational data to cut down the number of experiments needed, on pseudo- or quasi-experiments, on iterative designs, and on the on-line acquisition of data with minimal perturbation of the system under study. The participants of the cause-effect pairs challenge (http://www.causality.inf.ethz.ch/cause-effect.php) will be encouraged to submit papers.

The problem of attributing causes to effects is pervasive in science, medicine, economics, and almost every aspect of everyday life involving human reasoning and decision making. What affects your health? The economy? Climate change? The gold standard for establishing causal relationships is to perform randomized controlled experiments. However, experiments are costly, while non-experimental "observational" data collected routinely around the world are readily available. Unraveling potential cause-effect relationships from such observational data could save a lot of time and effort by allowing us to prioritize confirmatory experiments. This could be complemented by new strategies of incremental experimental design combining observational and experimental data.

Much of machine learning has so far concentrated on analyzing data already collected, rather than on collecting data. While experimental design is a well-developed discipline of statistics, data collection practitioners often neglect to apply its principled methods. As a result, the data made available to analysts, who are in charge of explaining them and building predictive or causal models, are not always of good quality and are often plagued by experimental artifacts. In reaction to this situation, some researchers in machine learning have become interested in experimental design as a way to close the gap between data acquisition or experimentation and model building. In parallel, researchers in causal studies have started raising awareness of the differences between passive observations, active sampling, and interventions. In this domain, only interventions qualify as true experiments capable of unraveling cause-effect relationships.

This workshop will discuss methods of experimental design that involve machine learning in the process of data collection. Experiments require intervening on the system under study, which is usually expensive and sometimes unethical or impossible. Changing the course of the planets to study the tides is impossible; forcing people to smoke to study the influence of smoking on health is unethical; modifying the placement of ads on web pages to optimize revenue may be expensive. In the latter case, recent methods proposed by Léon Bottou and others involve minimally perturbing the process with small random interventions, so as to collect interventional data around the operating point and extrapolate to estimate the effect of various interventions. Presently, there is a profusion of other algorithms being proposed, mostly evaluated on toy problems. One of the main challenges in causal learning is to develop strategies for objective evaluation. This includes, for instance, methods for acquiring large representative data sets with known ground truth. This, in turn, raises the question of how far the regularities observed in these data sets also apply to the data sets of actual interest, where the causal structure is unknown, because data sets with known ground truth may not be representative.
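The idea of small random interventions around an operating point can be illustrated with a generic importance-sampling sketch. This is not the procedure of Bottou and colleagues, only a minimal illustration under stated assumptions: the logged action is a single real number drawn from a Gaussian centered on the operating point, the reward function is a made-up quadratic, and the names `gaussian_pdf` and `estimate_shifted_policy_value` are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std**2) evaluated at x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


def estimate_shifted_policy_value(actions, rewards, logged_mean, new_mean, std):
    """Importance-weighted estimate of the mean reward that would have been
    observed had actions been drawn from N(new_mean, std**2) instead of the
    logging distribution N(logged_mean, std**2)."""
    weights = gaussian_pdf(actions, new_mean, std) / gaussian_pdf(actions, logged_mean, std)
    return float(np.mean(weights * rewards)), weights


# Simulated log: the system runs at its operating point with small random perturbations.
rng = np.random.default_rng(0)
operating_point, std = 1.0, 0.1
actions = rng.normal(operating_point, std, size=5000)

# Hypothetical response of the system, unknown to the analyst in practice.
rewards = 2.0 * actions - 0.5 * actions**2 + rng.normal(0.0, 0.05, size=actions.size)

# Value at the operating point (all weights equal 1) and at a nearby shifted point.
baseline, baseline_weights = estimate_shifted_policy_value(
    actions, rewards, operating_point, operating_point, std
)
shifted, shifted_weights = estimate_shifted_policy_value(
    actions, rewards, operating_point, operating_point + 0.05, std
)
print(f"estimated reward at operating point: {baseline:.3f}")
print(f"estimated reward after a small shift: {shifted:.3f}")
```

The estimate is only trustworthy for shifts that stay within the range actually explored by the random perturbations. Further away, a few samples carry very large weights and the variance of the estimate grows quickly.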

As part of an on-going effort to benchmark causal discovery methods, we organized a new challenge (March 28 to September 2, 2013) whose purpose was to devise a "coefficient of causation": given samples of a pair of variables A and B, compute a coefficient between -Inf and +Inf, with large positive values indicating that A causes B, large negative values indicating that B causes A, and values near zero indicating no causal relationship.
We provided hundreds of pairs of real variables with known causal relationships from domains as diverse as chemistry, climatology, ecology, economics, engineering, epidemiology, genomics, medicine, physics, and sociology. Those are intermixed with controls (pairs of independent variables and pairs of variables that are dependent but not causally related) and semi-artificial cause-effect pairs (real variables mixed in various ways to produce a given outcome). This challenge is limited to pairs of variables stripped of their context. Thus, constraint-based methods relying on conditional independence tests and/or graphical models are not applicable. The goal is to push the state of the art in complementary methods, which can eventually disambiguate Markov equivalence classes.
We are also planning to run, in October-November 2013, a second edition of the cause-effect pairs challenge intended to attract students who want to learn about the problem and build on top of the best challenge submission. This event will be sponsored in part by Microsoft and will serve to beta test CodaLab, a new machine learning experimentation platform to be launched in 2014.
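To make the "coefficient of causation" interface concrete, here is a minimal sketch of one possible scoring function. It is not a challenge submission or the organizers' baseline. It assumes an additive-noise heuristic: regress each variable on the other with a cubic polynomial and prefer the direction whose residuals look more independent of the input, using a biased HSIC estimate with Gaussian kernels as the dependence measure. All function names, the polynomial degree, and the kernel bandwidth are illustrative choices.

```python
import numpy as np


def standardize(v):
    """Rescale to zero mean and unit variance (assumes v is not constant)."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()


def hsic(x, y, bandwidth=1.0):
    """Biased HSIC estimate with Gaussian kernels; close to zero when the
    samples x and y look independent, larger when they look dependent."""
    n = x.size
    k = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * bandwidth**2))
    l = np.exp(-((y[:, None] - y[None, :]) ** 2) / (2.0 * bandwidth**2))
    # Double-center k, which is equivalent to H @ k @ H with H = I - 1/n.
    kc = k - k.mean(axis=0, keepdims=True) - k.mean(axis=1, keepdims=True) + k.mean()
    return float((kc * l).sum()) / n**2


def residual_dependence(cause, effect, degree=3):
    """Fit effect = f(cause) + noise with a polynomial f and measure how
    strongly the residuals still depend on the putative cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, standardize(residuals))


def causation_coefficient(a, b, degree=3):
    """Positive when the A -> B additive-noise fit leaves more independent
    residuals than the B -> A fit, negative in the opposite case."""
    a, b = standardize(a), standardize(b)
    return residual_dependence(b, a, degree) - residual_dependence(a, b, degree)


# Synthetic pair generated as B = f(A) + noise, so A is the cause by construction.
rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, size=300)
b = a**3 + 0.1 * rng.uniform(-1.0, 1.0, size=300)

score_ab = causation_coefficient(a, b)
score_ba = causation_coefficient(b, a)
print(f"coefficient for (A, B): {score_ab:+.4f}")
print(f"coefficient for (B, A): {score_ba:+.4f}")
```

By construction the score is antisymmetric: swapping the two variables flips its sign. It returns a real number but does not span the full range from -Inf to +Inf, and a score near zero cannot distinguish independent pairs from dependent but non-causal ones, a case the challenge controls are designed to probe.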

Part of the workshop will be devoted to discussing the results of the challenge and to planning future events, which may include a causality in time series challenge and a series of challenges on experimental design in which the participants can conduct virtual experiments on artificial systems. The workshop will bring together researchers in machine learning, statistics, and application domains including computational biology and econometrics.

Author Information

Isabelle Guyon (U. Paris-Saclay & ChaLearn)

Isabelle Guyon recently joined Google Brain as a research scientist. She is also professor of artificial intelligence at Université Paris-Saclay (Orsay). Her areas of expertise include computer vision, bioinformatics, and power systems. She is best known for being a co-inventor of Support Vector Machines. Her recent interests are in automated machine learning, meta-learning, and data-centric AI. She has been a strong promoter of challenges and benchmarks, and is president of ChaLearn, a non-profit dedicated to organizing machine learning challenges. She is community lead of Codalab competitions, a challenge platform used both in academia and industry. She co-organized the “Challenges in Machine Learning Workshop” @ NeurIPS between 2014 and 2019, launched the "NeurIPS challenge track" in 2017 while she was general chair, and pushed the creation of the "NeurIPS datasets and benchmark track" in 2021, as a NeurIPS board member.

Leon Bottou (Facebook AI Research)

Léon Bottou received a Diplôme from l'Ecole Polytechnique, Paris in 1987, a Magistère en Mathématiques Fondamentales et Appliquées et Informatiques from Ecole Normale Supérieure, Paris in 1988, and a PhD in Computer Science from Université de Paris-Sud in 1991. He was with AT&T Bell Labs from 1991 to 1992 and with AT&T Labs from 1995 to 2002. Between 1992 and 1995 he was chairman of Neuristique in Paris, a small company pioneering machine learning for data mining applications. He has been with NEC Labs America in Princeton since 2002. Léon's primary research interest is machine learning. His contributions to this field address theory, algorithms, and large-scale applications. Léon's secondary research interest is data compression and coding. His best known contribution in this field is the DjVu document compression technology (http://www.djvu.org). Léon has published over 70 papers and serves on the boards of JMLR and IEEE TPAMI. He also serves on the scientific advisory board of Kxen Inc.

Bernhard Schölkopf (MPI for Intelligent Systems)

Bernhard Schölkopf received degrees in mathematics (London) and physics (Tübingen), and a doctorate in computer science from the Technical University Berlin. He has researched at AT&T Bell Labs, at GMD FIRST, Berlin, at the Australian National University, Canberra, and at Microsoft Research Cambridge (UK). In 2001, he was appointed scientific member of the Max Planck Society and director at the MPI for Biological Cybernetics; in 2010 he founded the Max Planck Institute for Intelligent Systems. For further information, see www.kyb.tuebingen.mpg.de/~bs.

Alexander Statnikov (New York University)
Evelyne Viegas (Microsoft Research)
James M. Robins (Harvard University)
