We consider the problem of learning rules from natural language text sources. These sources, such as news articles and web texts, are created by a writer to communicate information to a reader, where the writer and reader share substantial domain knowledge. Consequently, the texts tend to be concise and mention the minimum information necessary for the reader to draw the correct conclusions. We study the problem of learning domain knowledge from such concise texts, which is an instance of the general problem of learning in the presence of missing data. However, unlike standard approaches to missing data, in this setting we know that facts are more likely to be missing from the text in cases where the reader can infer them from the facts that are mentioned combined with the domain knowledge. Hence, we can explicitly model this "missingness" process and invert it via probabilistic inference to learn the underlying domain knowledge. This paper introduces a mention model that captures the probability of facts being mentioned in the text based on what other facts have already been mentioned and domain knowledge in the form of Horn clause rules. Learning must simultaneously search the space of rules and learn the parameters of the mention model. We accomplish this via an application of Expectation Maximization within a Markov Logic framework. An experimental evaluation on synthetic and natural text data shows that the method can learn accurate rules and apply them to new texts to make correct inferences. Experiments also show that the method outperforms the standard EM approach that assumes mentions are missing at random.
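To make the abstract's central point concrete, the following toy sketch shows why treating omissions as missing at random biases the estimate of a rule's strength, and how inverting a mention model removes that bias. It is not the paper's Markov Logic EM procedure: the function names, the single-fact setting, and the assumption that the mention probabilities are already known are all illustrative assumptions.

```python
# Toy illustration (assumed setup, not the paper's algorithm).
# A fact is true with probability p_true. The writer mentions it with
# probability mention_if_true when it is true (often omitted, since the
# reader can infer it) and mention_if_false when it is false.

def observed_true_fraction(p_true, mention_if_true, mention_if_false):
    """Fraction of explicit mentions that state the fact is true."""
    said_true = p_true * mention_if_true
    said_false = (1.0 - p_true) * mention_if_false
    return said_true / (said_true + said_false)


def invert_mention_model(q_true, mention_if_true, mention_if_false):
    """Recover P(fact is true) from the observed fraction q_true by
    weighting each outcome by the inverse of its mention probability."""
    w_true = q_true / mention_if_true
    w_false = (1.0 - q_true) / mention_if_false
    return w_true / (w_true + w_false)


p = 0.9                      # true rule strength (hypothetical value)
m_true, m_false = 0.2, 0.9   # hypothetical mention probabilities

q = observed_true_fraction(p, m_true, m_false)
print(round(q, 3))                                         # 0.667
print(round(invert_mention_model(q, m_true, m_false), 3))  # 0.9
```

Counting only explicit mentions gives about 0.667, well below the true 0.9, because the inferable outcome is the one most often left unsaid. In the paper's setting the mention parameters are not known in advance and are learned jointly with the rules.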
Author Information
M. Shahed Sorower (Capital One Labs)
Thomas Dietterich (Oregon State University)
Tom Dietterich (AB Oberlin College 1977; MS University of Illinois 1979; PhD Stanford University 1984) is Professor and Director of Intelligent Systems Research at Oregon State University. Among his contributions to machine learning research are (a) the formalization of the multiple-instance problem, (b) the development of the error-correcting output coding method for multi-class prediction, (c) methods for ensemble learning, (d) the development of the MAXQ framework for hierarchical reinforcement learning, and (e) the application of gradient tree boosting to problems of structured prediction and latent variable models. Dietterich has pursued application-driven fundamental research in many areas including drug discovery, computer vision, computational sustainability, and intelligent user interfaces. Dietterich has served the machine learning community in a variety of roles including Executive Editor of the Machine Learning journal, co-founder of the Journal of Machine Learning Research, editor of the MIT Press Book Series on Adaptive Computation and Machine Learning, and editor of the Morgan-Claypool Synthesis series on Artificial Intelligence and Machine Learning. He was Program Co-Chair of AAAI-1990, Program Chair of NIPS-2000, and General Chair of NIPS-2001. He was the first President of the International Machine Learning Society (the parent organization of ICML) and served a term on the NIPS Board of Trustees and the Council of AAAI.
Janardhan Rao Doppa (Oregon State University)
Walker Orr (Oregon State University)
Prasad Tadepalli (Oregon State University)
Xiaoli Fern (Oregon State University)
More from the Same Authors
- 2021 Spotlight: Optimal Policies Tend To Seek Power
  Alex Turner · Logan Smith · Rohin Shah · Andrew Critch · Prasad Tadepalli
- 2021: Deep RePReL--Combining Planning and Deep RL for acting in relational domains
  Harsha Kokel · Arjun Manoharan · Sriraam Natarajan · Balaraman Ravindran · Prasad Tadepalli
- 2022 Poster: Parametrically Retargetable Decision-Makers Tend To Seek Power
  Alex Turner · Prasad Tadepalli
- 2021 Poster: One Explanation is Not Enough: Structured Attention Graphs for Image Classification
  Vivswan Shitole · Fuxin Li · Minsuk Kahng · Prasad Tadepalli · Alan Fern
- 2021 Poster: Optimal Policies Tend To Seek Power
  Alex Turner · Logan Smith · Rohin Shah · Andrew Critch · Prasad Tadepalli
- 2020: Mini-panel discussion 3 - Prioritizing Real World RL Challenges
  Chelsea Finn · Thomas Dietterich · Angela Schoellig · Anca Dragan · Anusha Nagabandi · Doina Precup
- 2020: Keynote: Tom Dietterich
  Thomas Dietterich
- 2020 Poster: Avoiding Side Effects in Complex Environments
  Alex Turner · Neale Ratzlaff · Prasad Tadepalli
- 2020 Spotlight: Avoiding Side Effects in Complex Environments
  Alex Turner · Neale Ratzlaff · Prasad Tadepalli
- 2019: AI and Sustainable Development
  Fei Fang · Carla Gomes · Miguel Luengo-Oroz · Thomas Dietterich · Julien Cornebise
- 2019: Automated Quality Control for a Weather Sensor Network
  Thomas Dietterich
- 2016: Automated Data Cleaning via Multi-View Anomaly Detection
  Thomas Dietterich
- 2014 Workshop: 3rd NIPS Workshop on Probabilistic Programming
  Daniel Roy · Josh Tenenbaum · Thomas Dietterich · Stuart J Russell · YI WU · Ulrik R Beierholm · Alp Kucukelbir · Zenna Tavares · Yura Perov · Daniel Lee · Brian Ruttenberg · Sameer Singh · Michael Hughes · Marco Gaboardi · Alexey Radul · Vikash Mansinghka · Frank Wood · Sebastian Riedel · Prakash Panangaden
- 2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty)
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar
- 2013 Workshop: Machine Learning for Sustainability
  Edwin Bonilla · Thomas Dietterich · Theodoros Damoulas · Andreas Krause · Daniel Sheldon · Iadine Chades · J. Zico Kolter · Bistra Dilkina · Carla Gomes · Hugo P Simao
- 2013 Poster: Symbolic Opportunistic Policy Iteration for Factored-Action MDPs
  Aswin Raghavan · Roni Khardon · Alan Fern · Prasad Tadepalli
- 2012 Workshop: Human Computation for Science and Computational Sustainability
  Theodoros Damoulas · Thomas Dietterich · Edith Law · Serge Belongie
- 2012 Poster: A Bayesian Approach for Policy Learning from Trajectory Preference Queries
  Aaron Wilson · Alan Fern · Prasad Tadepalli
- 2012 Poster: Probabilistic Topic Coding for Superset Label Learning
  Liping Liu · Thomas Dietterich
- 2012 Invited Talk: Challenges for Machine Learning in Computational Sustainability
  Thomas Dietterich
- 2011 Workshop: Machine Learning for Sustainability
  Thomas Dietterich · J. Zico Kolter · Matthew A Brown
- 2011 Poster: Budgeted Optimization with Concurrent Stochastic-Duration Experiments
  Javad Azimi · Alan Fern · Xiaoli Fern
- 2011 Spotlight: Budgeted Optimization with Concurrent Stochastic-Duration Experiments
  Javad Azimi · Alan Fern · Xiaoli Fern
- 2011 Poster: Autonomous Learning of Action Models for Planning
  Neville Mehta · Prasad Tadepalli · Alan Fern
- 2011 Poster: Collective Graphical Models
  Daniel Sheldon · Thomas Dietterich
- 2010 Poster: A Computational Decision Theory for Interactive Assistants
  Alan Fern · Prasad Tadepalli
- 2010 Poster: Batch Bayesian Optimization via Simulation Matching
  Javad Azimi · Alan Fern · Xiaoli Fern
- 2009 Mini Symposium: Machine Learning for Sustainability
  J. Zico Kolter · Thomas Dietterich · Andrew Y Ng