Text understanding is an old, still-unsolved AI problem that consists of a number of nontrivial steps. The critical step is knowledge acquisition from text, i.e. the transition from non-formalized text into a formalized, actionable language (one that supports reasoning). Other steps in the text understanding pipeline include linguistic processing, reasoning, text generation, search, and question answering, which are more or less solved to a degree that allows a text understanding service to be composed. Knowledge acquisition, the key bottleneck, can be done by humans, but automating the process in its full breadth is still out of reach.
After failed attempts in the past (due to a lack of theoretical and technological prerequisites), interest in text understanding and knowledge acquisition from text has been growing in recent years. A number of AI research groups are working on various aspects of the problem in the areas of computational linguistics, machine learning, probabilistic and logical reasoning, and the semantic web. What the newer approaches have in common is the use of machine learning to deal with representational change. Some of the groups working in the area:
• Carnegie Mellon University (Never-Ending Language Learning: http://rtw.ml.cmu.edu/rtw/)
• Cycorp (Semantic Construction Grammar: http://www.cyc.com/)
• IBM Research (Watson project: http://www.ibm.com/watson)
• IDIAP Research Institute (Deep Learning for NLP: http://publications.idiap.ch/index.php/authors/show/336)
• Jozef Stefan Institute (Cross-Lingual Knowledge-Extraction: http://xlike.org)
• KU Leuven (Spatial Role Labelling via Machine Learning for SEMEVAL)
• Max Planck Institut (YAGO project: http://www.mpi-inf.mpg.de/yago-naga/yago/)
• MIT Media Lab (ConceptNet: http://conceptnet5.media.mit.edu/)
• University of Washington (Open Information Extraction: http://openie.cs.washington.edu/)
• Vulcan Inc. (Semantic Inferencing on Large Knowledge: http://silk.semwebcentral.org/)
Apart from the above projects, there is a noticeable increase of interest among technology companies (such as Google, Microsoft, IBM) and big publishers (such as NYTimes, BBC, Bloomberg) in employing semantic technologies in their services, moving towards an understanding of unstructured data that goes beyond shallow, representation-poor text mining and information retrieval techniques.
Workshop objective: Since all of the attempts listed above make extensive use of machine learning and probabilistic approaches, the goal of the workshop is to bring together key researchers and practitioners from the area to exchange ideas, approaches, and techniques used to deal with text understanding and the related knowledge acquisition problems.
Author Information
Marko Grobelnik (Jozef Stefan Institute)
Blaz Fortuna (Jozef Stefan Institute)
Estevam Hruschka (Amazon)
Michael J Witbrock (Cycorp Inc)