
A Scalable Hierarchical Distributed Language Model
Andriy Mnih · Geoffrey E Hinton

Tue Dec 09, 7:30 PM -- 12:00 AM (PST)

Neural probabilistic language models (NPLMs) have been shown to be competitive with, and occasionally superior to, the widely used n-gram language models. The main drawback of NPLMs is their extremely long training and testing times. Morin and Bengio proposed a hierarchical language model built around a binary tree of words that was two orders of magnitude faster than the non-hierarchical language model it was based on. However, it performed considerably worse than its non-hierarchical counterpart, despite using a word tree built from expert knowledge. We introduce a fast hierarchical language model along with a simple feature-based algorithm for automatically constructing word trees from the data. We then show that the resulting models can outperform non-hierarchical models and achieve state-of-the-art performance.
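The speedup in tree-based models of this kind comes from replacing the flat softmax over the whole vocabulary with a sequence of binary decisions along a root-to-leaf path, so scoring one word costs O(log V) rather than O(V). The sketch below illustrates this general idea; all names, shapes, and the `word_probability` helper are illustrative assumptions, not the paper's actual model or API.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def word_probability(context_vec, path_nodes, path_bits, node_vectors):
    """P(word | context) as a product of per-node binary decisions.

    context_vec  : (d,) context representation produced by the model.
    path_nodes   : indices of the internal nodes on the word's root-to-leaf path.
    path_bits    : the left/right (0/1) decision taken at each of those nodes.
    node_vectors : (num_internal_nodes, d) one parameter vector per internal node.
    """
    prob = 1.0
    for node, bit in zip(path_nodes, path_bits):
        # Probability of branching left at this node, given the context.
        p_left = sigmoid(node_vectors[node] @ context_vec)
        prob *= p_left if bit == 0 else (1.0 - p_left)
    return prob

# Toy usage: a 4-word vocabulary needs 3 internal nodes; the word reached via
# root -> left -> right has path_nodes=[0, 1] and path_bits=[0, 1].
rng = np.random.default_rng(0)
d = 8
node_vectors = rng.normal(size=(3, d))
context = rng.normal(size=d)
print(word_probability(context, path_nodes=[0, 1], path_bits=[0, 1],
                       node_vectors=node_vectors))
```

Because each word's probability depends only on the parameters of the internal nodes on its path, both training updates and per-word scoring touch a logarithmic number of parameters, which is the source of the two-orders-of-magnitude speedup the abstract cites.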

Author Information

Andriy Mnih (DeepMind)
Geoffrey E Hinton (Google & University of Toronto)

Geoffrey Hinton received his PhD in Artificial Intelligence from Edinburgh in 1978 and spent five years as a faculty member at Carnegie Mellon, where he pioneered back-propagation, Boltzmann machines, and distributed representations of words. In 1987 he became a fellow of the Canadian Institute for Advanced Research and moved to the University of Toronto. In 1998 he founded the Gatsby Computational Neuroscience Unit at University College London, returning to the University of Toronto in 2001. His group at the University of Toronto then used deep learning to change the way speech recognition and object recognition are done. He currently splits his time between the University of Toronto and Google. In 2010 he received the NSERC Herzberg Gold Medal, Canada's top award in science and engineering.