Timezone: »
Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead. Experiments on several sequence prediction tasks show that this approach yields significant improvements. Moreover, it was used successfully in our winning bid to the MSCOCO image captioning challenge, 2015.
Author Information
Samy Bengio (Google Research)
Oriol Vinyals (Google)
Oriol Vinyals is a Research Scientist at Google. He works in deep learning with the Google Brain team. Oriol holds a Ph.D. in EECS from University of California, Berkeley, and a Masters degree from University of California, San Diego. He is a recipient of the 2011 Microsoft Research PhD Fellowship. He was an early adopter of the new deep learning wave at Berkeley, and in his thesis he focused on non-convex optimization and recurrent neural networks. At Google Brain he continues working on his areas of interest, which include artificial intelligence, with particular emphasis on machine learning, language, and vision.
Navdeep Jaitly (Google)
Noam Shazeer (Google)
More from the Same Authors
-
2020 Poster: Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards »
Yijie Guo · Jongwook Choi · Marcin Moczulski · Shengyu Feng · Samy Bengio · Mohammad Norouzi · Honglak Lee -
2020 Poster: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Spotlight: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Session: Orals & Spotlights Track 28: Deep Learning »
Oriol Vinyals · Guido Montufar -
2019 Poster: Transfusion: Understanding Transfer Learning for Medical Imaging »
Maithra Raghu · Chiyuan Zhang · Jon Kleinberg · Samy Bengio -
2019 Poster: Generating Diverse High-Fidelity Images with VQ-VAE-2 »
Ali Razavi · Aaron van den Oord · Oriol Vinyals -
2019 Poster: Classification Accuracy Score for Conditional Generative Models »
Suman Ravuri · Oriol Vinyals -
2018 Poster: Large Margin Deep Networks for Classification »
Gamaleldin Elsayed · Dilip Krishnan · Hossein Mobahi · Kevin Regan · Samy Bengio -
2018 Poster: Insights on representational similarity in neural networks with canonical correlation »
Ari Morcos · Maithra Raghu · Samy Bengio -
2018 Poster: Blockwise Parallel Decoding for Deep Autoregressive Models »
Mitchell Stern · Noam Shazeer · Jakob Uszkoreit -
2018 Poster: Content preserving text generation with attribute controls »
Lajanugen Logeswaran · Honglak Lee · Samy Bengio -
2018 Poster: Relational recurrent neural networks »
Adam Santoro · Ryan Faulkner · David Raposo · Jack Rae · Mike Chrzanowski · Theophane Weber · Daan Wierstra · Oriol Vinyals · Razvan Pascanu · Timothy Lillicrap -
2018 Poster: Mesh-TensorFlow: Deep Learning for Supercomputers »
Noam Shazeer · Youlong Cheng · Niki Parmar · Dustin Tran · Ashish Vaswani · Penporn Koanantakool · Peter Hawkins · HyoukJoong Lee · Mingsheng Hong · Cliff Young · Ryan Sepassi · Blake Hechtman -
2017 Workshop: Deep Learning: Bridging Theory and Practice »
Sanjeev Arora · Maithra Raghu · Russ Salakhutdinov · Ludwig Schmidt · Oriol Vinyals -
2017 Poster: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Poster: Attention is All you Need »
Ashish Vaswani · Noam Shazeer · Niki Parmar · Jakob Uszkoreit · Llion Jones · Aidan Gomez · Łukasz Kaiser · Illia Polosukhin -
2017 Spotlight: Attention is All you Need »
Ashish Vaswani · Noam Shazeer · Niki Parmar · Jakob Uszkoreit · Llion Jones · Aidan Gomez · Łukasz Kaiser · Illia Polosukhin -
2017 Oral: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Poster: Neural Discrete Representation Learning »
Aaron van den Oord · Oriol Vinyals · koray kavukcuoglu -
2017 Tutorial: Deep Learning: Practice and Trends »
Nando de Freitas · Scott Reed · Oriol Vinyals -
2016 Workshop: Extreme Classification: Multi-class and Multi-label Learning in Extremely Large Label Spaces »
Moustapha Cisse · Manik Varma · Samy Bengio -
2016 Symposium: Deep Learning Symposium »
Yoshua Bengio · Yann LeCun · Navdeep Jaitly · Roger Grosse -
2016 Poster: Conditional Image Generation with PixelCNN Decoders »
Aaron van den Oord · Nal Kalchbrenner · Lasse Espeholt · koray kavukcuoglu · Oriol Vinyals · Alex Graves -
2016 Poster: Can Active Memory Replace Attention? »
Łukasz Kaiser · Samy Bengio -
2016 Poster: An Online Sequence-to-Sequence Model Using Partial Conditioning »
Navdeep Jaitly · Quoc V Le · Oriol Vinyals · Ilya Sutskever · David Sussillo · Samy Bengio -
2016 Poster: Reward Augmented Maximum Likelihood for Neural Structured Prediction »
Mohammad Norouzi · Samy Bengio · zhifeng Chen · Navdeep Jaitly · Mike Schuster · Yonghui Wu · Dale Schuurmans -
2016 Poster: Strategic Attentive Writer for Learning Macro-Actions »
Alexander (Sasha) Vezhnevets · Volodymyr Mnih · Simon Osindero · Alex Graves · Oriol Vinyals · John Agapiou · koray kavukcuoglu -
2016 Poster: Matching Networks for One Shot Learning »
Oriol Vinyals · Charles Blundell · Timothy Lillicrap · koray kavukcuoglu · Daan Wierstra -
2015 Poster: Pointer Networks »
Oriol Vinyals · Meire Fortunato · Navdeep Jaitly -
2015 Spotlight: Pointer Networks »
Oriol Vinyals · Meire Fortunato · Navdeep Jaitly -
2015 Poster: Grammar as a Foreign Language »
Oriol Vinyals · Łukasz Kaiser · Terry Koo · Slav Petrov · Ilya Sutskever · Geoffrey Hinton -
2015 Tutorial: Large-Scale Distributed Systems for Training Neural Networks »
Jeff Dean · Oriol Vinyals -
2013 Workshop: Output Representation Learning »
Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · julien audiffren · Carlo Ciliberto · Dan Goldwasser -
2013 Poster: DeViSE: A Deep Visual-Semantic Embedding Model »
Andrea Frome · Greg Corrado · Jon Shlens · Samy Bengio · Jeff Dean · Marc'Aurelio Ranzato · Tomas Mikolov -
2012 Workshop: Big Data Meets Computer Vision: First International Workshop on Large Scale Visual Recognition and Retrieval »
Jia Deng · Samy Bengio · Yuanqing Lin · Li Fei-Fei -
2010 Poster: Label Embedding Trees for Large Multi-Class Tasks »
Samy Bengio · Jason E Weston · David Grangier -
2009 Poster: Group Sparse Coding »
Samy Bengio · Fernando Pereira · Yoram Singer · Dennis Strelow -
2009 Poster: An Online Algorithm for Large Scale Image Similarity Learning »
Gal Chechik · Uri Shalit · Varun Sharma · Samy Bengio -
2007 Workshop: Efficient Machine Learning - Overcoming Computational Bottlenecks in Machine Learning (Part 2) »
Samy Bengio · Corinna Cortes · Dennis DeCoste · Francois Fleuret · Ramesh Natarajan · Edwin Pednault · Dan Pelleg · Elad Yom-Tov -
2007 Workshop: Efficient Machine Learning - Overcoming Computational Bottlenecks in Machine Learning (Part 1) »
Samy Bengio · Corinna Cortes · Dennis DeCoste · Francois Fleuret · Ramesh Natarajan · Edwin Pednault · Dan Pelleg · Elad Yom-Tov -
2006 Workshop: Learning to Compare Examples »
David Grangier · Samy Bengio