Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains an obstacle to their widespread use. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operations of the memory. We also introduce a new Gated Recursive Cell to compose lower-level representations into higher-level representations. We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps task (Nangia and Bowman, 2018). We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparably to the state-of-the-art methods in the literature.
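As a rough illustration of how a cumulative probability can act as a write/erase gate over ordered memory slots, the sketch below takes a softmax over per-slot scores and uses its running sum as a monotone gate. This is not the paper's formulation: the function names, the scalar slots, the gating direction, and the simple linear blend are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cumulative_gate(scores):
    """Running sum of a softmax: non-decreasing across slots, ending near 1."""
    gate, running = [], 0.0
    for p in softmax(scores):
        running += p
        gate.append(running)
    return gate

def gated_update(memory, candidate, scores):
    """Hypothetical blend: a gate near 0 keeps the old slot value,
    a gate near 1 erases it and writes the candidate value instead."""
    gate = cumulative_gate(scores)
    return [(1.0 - g) * old + g * new
            for g, old, new in zip(gate, memory, candidate)]

# Toy example: attention peaks at slot 2, so slots 0-1 are kept
# and slots 2-3 are overwritten by the candidate values.
memory = [1.0, 2.0, 3.0, 4.0]
candidate = [0.0, 0.0, 0.0, 0.0]
scores = [-10.0, -10.0, 10.0, -10.0]
gate = cumulative_gate(scores)
updated = gated_update(memory, candidate, scores)
```

Because the gate is a running sum of probabilities, it can only increase from one slot to the next, so a single attention peak splits the memory into a preserved part and an overwritten part.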
Author Information
Yikang Shen (Mila, University of Montreal, MSR Montreal)
Shawn Tan (Mila)
Arian Hosseini (Mila, University of Montreal, MSR Montreal)
Zhouhan Lin (Mila)
Alessandro Sordoni (Microsoft Research)
Aaron Courville (U. Montreal)
More from the Same Authors
-
2021 Spotlight: A Variational Perspective on Diffusion-Based Generative Models and Score Matching »
Chin-Wei Huang · Jae Hyun Lim · Aaron Courville -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 : Behavior Predictive Representations for Generalization in Reinforcement Learning »
Siddhant Agarwal · Aaron Courville · Rishabh Agarwal -
2021 : MIDI-DDSP: Hierarchical Modeling of Music for Detailed Control »
Yusong Wu · Ethan Manilow · Kyle Kastner · Tim Cooijmans · Aaron Courville · Cheng-Zhi Anna Huang · Jesse Engel -
2022 : Planning with Large Language Models for Code Generation »
Shun Zhang · Zhenfang Chen · Yikang Shen · Mingyu Ding · Josh Tenenbaum · Chuang Gan -
2022 : Hyper-Decision Transformer for Efficient Online Policy Adaptation »
Mengdi Xu · Yuchen Lu · Yikang Shen · Shun Zhang · DING ZHAO · Chuang Gan -
2022 : Datasets That Are Not: Evolving Novelty Through Sparsity and Iterated Learning »
Yusong Wu · Kyle Kastner · Tim Cooijmans · Cheng-Zhi Anna Huang · Aaron Courville -
2022 : Unleashing The Potential of Data Sharing in Ensemble Deep Reinforcement Learning »
Zhixuan Lin · Pierluca D'Oro · Evgenii Nikishin · Aaron Courville -
2022 : Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier »
Pierluca D'Oro · Max Schwarzer · Evgenii Nikishin · Pierre-Luc Bacon · Marc Bellemare · Aaron Courville -
2022 : Investigating Multi-task Pretraining and Generalization in Reinforcement Learning »
Adrien Ali Taiga · Rishabh Agarwal · Jesse Farebrother · Aaron Courville · Marc Bellemare -
2022 Poster: Riemannian Diffusion Models »
Chin-Wei Huang · Milad Aghajohari · Joey Bose · Prakash Panangaden · Aaron Courville -
2022 Poster: Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress »
Rishabh Agarwal · Max Schwarzer · Pablo Samuel Castro · Aaron Courville · Marc Bellemare -
2021 Workshop: Advances in Programming Languages and Neurosymbolic Systems (AIPLANS) »
Breandan Considine · Disha Shrivastava · David Yu-Tung Hui · Chin-Wei Huang · Shawn Tan · Xujie Si · Prakash Panangaden · Guy Van den Broeck · Daniel Tarlow -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 Poster: Gradient Starvation: A Learning Proclivity in Neural Networks »
Mohammad Pezeshki · Oumar Kaba · Yoshua Bengio · Aaron Courville · Doina Precup · Guillaume Lajoie -
2021 Poster: Pretraining Representations for Data-Efficient Reinforcement Learning »
Max Schwarzer · Nitarshan Rajkumar · Michael Noukhovitch · Ankesh Anand · Laurent Charlin · R Devon Hjelm · Philip Bachman · Aaron Courville -
2021 Poster: A Variational Perspective on Diffusion-Based Generative Models and Score Matching »
Chin-Wei Huang · Jae Hyun Lim · Aaron Courville -
2021 Oral: Deep Reinforcement Learning at the Edge of the Statistical Precipice »
Rishabh Agarwal · Max Schwarzer · Pablo Samuel Castro · Aaron Courville · Marc Bellemare -
2021 Poster: Self-Instantiated Recurrent Units with Dynamic Soft Recursion »
Aston Zhang · Yi Tay · Yikang Shen · Alvin Chan · SHUAI ZHANG -
2021 Poster: Deep Reinforcement Learning at the Edge of the Statistical Precipice »
Rishabh Agarwal · Max Schwarzer · Pablo Samuel Castro · Aaron Courville · Marc Bellemare -
2020 Workshop: AI for Earth Sciences »
Surya Karthik Mukkavilli · Johanna Hansen · Natasha Dudek · Tom Beucler · Kelly Kochanski · Mayur Mudigonda · Karthik Kashinath · Amy McGovern · Paul D Miller · Chad Frischmann · Pierre Gentine · Gregory Dudek · Aaron Courville · Daniel Kammen · Vipin Kumar -
2020 Poster: Unsupervised Learning of Dense Visual Representations »
Pedro O. Pinheiro · Amjad Almahairi · Ryan Benmalek · Florian Golemo · Aaron Courville -
2019 Poster: MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis »
Kundan Kumar · Rithesh Kumar · Thibault de Boissiere · Lucas Gestin · Wei Zhen Teoh · Jose Sotelo · Alexandre de Brébisson · Yoshua Bengio · Aaron Courville -
2019 Poster: No-Press Diplomacy: Modeling Multi-Agent Gameplay »
Philip Paquette · Yuchen Lu · SETON STEVEN BOCCO · Max Smith · Satya O.-G. · Jonathan K. Kummerfeld · Joelle Pineau · Satinder Singh · Aaron Courville -
2018 Workshop: Visually grounded interaction and language »
Florian Strub · Harm de Vries · Erik Wijmans · Samyak Datta · Ethan Perez · Mateusz Malinowski · Stefan Lee · Peter Anderson · Aaron Courville · Jeremie MARY · Dhruv Batra · Devi Parikh · Olivier Pietquin · Chiori HORI · Tim Marks · Anoop Cherian -
2018 Poster: Improving Explorability in Variational Inference with Annealed Variational Objectives »
Chin-Wei Huang · Shawn Tan · Alexandre Lacoste · Aaron Courville -
2018 Poster: Towards Text Generation with Adversarially Learned Neural Outlines »
Sandeep Subramanian · Sai Rajeswar Mudumba · Alessandro Sordoni · Adam Trischler · Aaron Courville · Chris Pal -
2017 Workshop: Visually grounded interaction and language »
Florian Strub · Harm de Vries · Abhishek Das · Satwik Kottur · Stefan Lee · Mateusz Malinowski · Olivier Pietquin · Devi Parikh · Dhruv Batra · Aaron Courville · Jeremie Mary -
2017 Poster: Improved Training of Wasserstein GANs »
Ishaan Gulrajani · Faruk Ahmed · Martin Arjovsky · Vincent Dumoulin · Aaron Courville -
2017 Demonstration: A Deep Reinforcement Learning Chatbot »
Iulian Vlad Serban · Chinnadhurai Sankar · Mathieu Germain · Saizheng Zhang · Zhouhan Lin · Sandeep Subramanian · Taesup Kim · Michael Pieper · Sarath Chandar · Nan Rosemary Ke · Sai Rajeswar Mudumba · Alexandre de Brébisson · Jose Sotelo · Dendi A Suhubdy · Vincent Michalski · Joelle Pineau · Yoshua Bengio -
2017 Poster: GibbsNet: Iterative Adversarial Inference for Deep Graphical Models »
Alex Lamb · R Devon Hjelm · Yaroslav Ganin · Joseph Paul Cohen · Aaron Courville · Yoshua Bengio -
2017 Poster: Modulating early visual processing by language »
Harm de Vries · Florian Strub · Jeremie Mary · Hugo Larochelle · Olivier Pietquin · Aaron Courville -
2017 Spotlight: Modulating early visual processing by language »
Harm de Vries · Florian Strub · Jeremie Mary · Hugo Larochelle · Olivier Pietquin · Aaron Courville -
2016 : Discussion panel »
Ian Goodfellow · Soumith Chintala · Arthur Gretton · Sebastian Nowozin · Aaron Courville · Yann LeCun · Emily Denton -
2016 : Adversarially Learned Inference (ALI) and BiGANs »
Aaron Courville -
2016 Poster: Architectural Complexity Measures of Recurrent Neural Networks »
Saizheng Zhang · Yuhuai Wu · Tong Che · Zhouhan Lin · Roland Memisevic · Russ Salakhutdinov · Yoshua Bengio -
2016 Poster: Professor Forcing: A New Algorithm for Training Recurrent Networks »
Alex M Lamb · Anirudh Goyal · Ying Zhang · Saizheng Zhang · Aaron Courville · Yoshua Bengio -
2015 : Introduction »
Aaron Courville -
2015 Workshop: Multimodal Machine Learning »
Louis-Philippe Morency · Tadas Baltrusaitis · Aaron Courville · Kyunghyun Cho -
2015 Poster: A Recurrent Latent Variable Model for Sequential Data »
Junyoung Chung · Kyle Kastner · Laurent Dinh · Kratarth Goel · Aaron Courville · Yoshua Bengio -
2014 Poster: Generative Adversarial Nets »
Ian Goodfellow · Jean Pouget-Abadie · Mehdi Mirza · Bing Xu · David Warde-Farley · Sherjil Ozair · Aaron Courville · Yoshua Bengio -
2013 Poster: Multi-Prediction Deep Boltzmann Machines »
Ian Goodfellow · Mehdi Mirza · Aaron Courville · Yoshua Bengio -
2011 Poster: On Tracking The Partition Function »
Guillaume Desjardins · Aaron Courville · Yoshua Bengio -
2009 Poster: An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism »
Aaron Courville · Douglas Eck · Yoshua Bengio -
2009 Session: Oral Session 3: Deep Learning and Network Models »
Aaron Courville -
2008 Session: Oral session 11: Attention and Mind »
Aaron Courville -
2007 Spotlight: The rat as particle filter »
Nathaniel D Daw · Aaron Courville -
2007 Poster: The rat as particle filter »
Nathaniel D Daw · Aaron Courville