Playing text-based games requires skills in processing natural language and sequential decision making. Achieving human-level performance on text-based games remains an open challenge, and prior research has largely relied on hand-crafted structured representations and heuristics. In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text. We propose a novel graph-aided transformer agent (GATA) that infers and updates latent belief graphs during planning to enable effective action selection by capturing the underlying game dynamics. GATA is trained using a combination of reinforcement and self-supervised learning. Our work demonstrates that the learned graph-based representations help agents converge to better policies than their text-only counterparts and facilitate effective generalization across game configurations. Experiments on 500+ unique games from the TextWorld suite show that our best agent outperforms text-based baselines by an average of 24.2%.
Author Information
Ashutosh Adhikari (University of Waterloo)
Xingdi Yuan (Microsoft Research)
Marc-Alexandre Côté (Microsoft Research)
Mikuláš Zelinka (Charles University, Faculty of Mathematics and Physics)
Marc-Antoine Rondeau (Microsoft Research)
Romain Laroche (Microsoft Research)
Pascal Poupart (University of Waterloo & Vector Institute)
Jian Tang (Mila)
Adam Trischler (Microsoft)
Will Hamilton (McGill)
More from the Same Authors
-
2021 Spotlight: Neural Algorithmic Reasoners are Implicit Planners »
Andreea-Ioana Deac · Petar Veličković · Ognjen Milinkovic · Pierre-Luc Bacon · Jian Tang · Mladen Nikolic -
2021 : Multi-task Learning with Domain Knowledge for Molecular Property Prediction »
Shengchao Liu · Meng Qu · Zuobai Zhang · Jian Tang -
2022 Poster: Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning »
Riashat Islam · Hongyu Zang · Anirudh Goyal · Alex Lamb · Kenji Kawaguchi · Xin Li · Romain Laroche · Yoshua Bengio · Remi Tachet des Combes -
2022 Poster: Optimality and Stability in Non-Convex Smooth Games »
Guojun Zhang · Pascal Poupart · Yaoliang Yu -
2022 : GAUCHE: A Library for Gaussian Processes in Chemistry »
Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Bojana Rankovic · Yuanqi Du · Arian Jamasb · Julius Schwartz · Austin Tripp · Gregory Kell · Anthony Bourached · Alex Chan · Jacob Moss · Chengzhi Guo · Alpha Lee · Philippe Schwaller · Jian Tang -
2022 : Attribute Controlled Dialogue Prompting »
Runcheng Liu · Ahmad Rashid · Ivan Kobyzev · Mehdi Rezagholizadeh · Pascal Poupart -
2022 : Fifteen-minute Competition Overview Video »
Maartje Anne ter Hoeve · Mikhail Burtsev · Zoya Volovikova · Ziming Li · Yuxuan Sun · Shrestha Mohanty · Negar Arabzadeh · Mohammad Aliannejadi · Milagro Teruel · Marc-Alexandre Côté · Kavya Srinet · arthur szlam · Artem Zholus · Alexey Skrynnik · Aleksandr Panov · Ahmed Awadallah · Julia Kiseleva -
2022 : Geometric attacks on batch normalization »
Amur Ghose · Apurv Gupta · Yaoliang Yu · Pascal Poupart -
2023 Poster: Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning »
Hongyu Zang · Xin Li · Leiji Zhang · Yang Liu · Baigui Sun · Riashat Islam · Remi Tachet des Combes · Romain Laroche -
2023 Poster: Deep language networks: joint prompt training of stacked LLMs using variational inference »
Alessandro Sordoni · Eric Yuan · Marc-Alexandre Côté · Matheus Pereira · Adam Trischler · Ziang Xiao · Arian Hosseini · Friederike Niedtner · Nicolas Le Roux -
2023 Poster: Batchnorm Allows Unsupervised Radial Attacks »
Amur Ghose · Apurv Gupta · Yaoliang Yu · Pascal Poupart -
2023 Poster: Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets »
Zhang-Wei Hong · Aviral Kumar · Sathwik Karnik · Abhishek Bhandwaldar · Akash Srivastava · Joni Pajarinen · Romain Laroche · Abhishek Gupta · Pulkit Agrawal -
2023 Poster: An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient »
Yudong Luo · Guiliang Liu · Pascal Poupart · Yangchen Pan -
2023 Poster: Multi-Modal Inverse Constrained Reinforcement Learning from a Mixture of Demonstrations »
Guanren Qiao · Guiliang Liu · Pascal Poupart · Zhiqiang Xu -
2022 Spotlight: Optimality and Stability in Non-Convex Smooth Games »
Guojun Zhang · Pascal Poupart · Yaoliang Yu -
2022 Competition: IGLU: Interactive Grounded Language Understanding in a Collaborative Environment »
Julia Kiseleva · Alexey Skrynnik · Artem Zholus · Shrestha Mohanty · Negar Arabzadeh · Marc-Alexandre Côté · Mohammad Aliannejadi · Milagro Teruel · Ziming Li · Mikhail Burtsev · Maartje Anne ter Hoeve · Zoya Volovikova · Aleksandr Panov · Yuxuan Sun · arthur szlam · Ahmed Awadallah · Kavya Srinet -
2022 Workshop: LaReL: Language and Reinforcement Learning »
Laetitia Teodorescu · Laura Ruis · Tristan Karch · Cédric Colas · Paul Barde · Jelena Luketina · Athul Jacob · Pratyusha Sharma · Edward Grefenstette · Jacob Andreas · Marc-Alexandre Côté -
2022 Workshop: Second Workshop on Efficient Natural Language and Speech Processing (ENLSP-II) »
Mehdi Rezagholizadeh · Peyman Passban · Yue Dong · Lili Mou · Pascal Poupart · Ali Ghodsi · Qun Liu -
2022 Poster: When does return-conditioned supervised learning work for offline reinforcement learning? »
David Brandfonbrener · Alberto Bietti · Jacob Buckman · Romain Laroche · Joan Bruna -
2022 Poster: Uncertainty-Aware Reinforcement Learning for Risk-Sensitive Player Evaluation in Sports Game »
Guiliang Liu · Yudong Luo · Oliver Schulte · Pascal Poupart -
2021 : Best Papers and Closing Remarks »
Ali Ghodsi · Pascal Poupart -
2021 : Panel Discussion »
Pascal Poupart · Ali Ghodsi · Luke Zettlemoyer · Sameer Singh · Kevin Duh · Yejin Choi · Lu Hou -
2021 : AI X Molecule »
Jian Tang -
2021 Workshop: Efficient Natural Language and Speech Processing (Models, Training, and Inference) »
Mehdi Rezagholizadeh · Lili Mou · Yue Dong · Pascal Poupart · Ali Ghodsi · Qun Liu -
2021 : Opening Speech »
Pascal Poupart -
2021 Poster: Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs »
harsh satija · Philip Thomas · Joelle Pineau · Romain Laroche -
2021 : Multimodal Single-Cell Data Integration + Q&A »
Daniel Burkhardt · Smita Krishnaswamy · Malte Luecken · Debora Marks · Angela Pisco · Bastian Rieck · Jian Tang · Alexander Tong · Fabian Theis · Guy Wolf -
2021 Poster: Dr Jekyll & Mr Hyde: the strange case of off-policy policy updates »
Romain Laroche · Remi Tachet des Combes -
2021 Poster: Neural Algorithmic Reasoners are Implicit Planners »
Andreea-Ioana Deac · Petar Veličković · Ognjen Milinkovic · Pierre-Luc Bacon · Jian Tang · Mladen Nikolic -
2021 Poster: How to transfer algorithmic reasoning knowledge to learn new algorithms? »
Louis-Pascal Xhonneux · Andreea-Ioana Deac · Petar Veličković · Jian Tang -
2021 Poster: Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction »
Zhaocheng Zhu · Zuobai Zhang · Louis-Pascal Xhonneux · Jian Tang -
2021 Poster: Predicting Molecular Conformation via Dynamic Graph Score Matching »
Shitong Luo · Chence Shi · Minkai Xu · Jian Tang -
2021 Poster: Joint Modeling of Visual Objects and Relations for Scene Graph Generation »
Minghao Xu · Meng Qu · Bingbing Ni · Jian Tang -
2021 Poster: Quantifying and Improving Transferability in Domain Generalization »
Guojun Zhang · Han Zhao · Yaoliang Yu · Pascal Poupart -
2021 Poster: Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning »
Guiliang Liu · Xiangyu Sun · Oliver Schulte · Pascal Poupart -
2020 Workshop: Wordplay: When Language Meets Games »
Prithviraj Ammanabrolu · Matthew Hausknecht · Xingdi Yuan · Marc-Alexandre Côté · Adam Trischler · Kory Mathewson · John Urbanek · Jason Weston · Mark Riedl -
2020 : Contributed Talk 4: Directional Graph Networks »
Dominique Beaini · Saro Passaro · Vincent Létourneau · Will Hamilton · Gabriele Corso · Pietro Liò -
2020 Workshop: Differential Geometry meets Deep Learning (DiffGeo4DL) »
Joey Bose · Emile Mathieu · Charline Le Lan · Ines Chami · Frederic Sala · Christopher De Sa · Maximilian Nickel · Christopher Ré · Will Hamilton -
2020 Poster: Graph Policy Network for Transferable Active Learning on Graphs »
Shengding Hu · Zheng Xiong · Meng Qu · Xingdi Yuan · Marc-Alexandre Côté · Zhiyuan Liu · Jian Tang -
2020 Poster: Learning Agent Representations for Ice Hockey »
Guiliang Liu · Oliver Schulte · Pascal Poupart · Mike Rudd · Mehrsan Javan -
2020 Poster: Adversarial Example Games »
Joey Bose · Gauthier Gidel · Hugo Berard · Andre Cianflone · Pascal Vincent · Simon Lacoste-Julien · Will Hamilton -
2020 Poster: Towards Interpretable Natural Language Understanding with Explanations as Latent Variables »
Wangchunshu Zhou · Jinyi Hu · Hanlin Zhang · Xiaodan Liang · Maosong Sun · Chenyan Xiong · Jian Tang -
2019 : Poster Session #2 »
Yunzhu Li · Peter Meltzer · Jianing Sun · Guillaume SALHA · Marin Vlastelica Pogančić · Chia-Cheng Liu · Fabrizio Frasca · Marc-Alexandre Côté · Vikas Verma · Abdulkadir CELIKKANAT · Pierluca D'Oro · Priyesh Vijayan · Maria Schuld · Petar Veličković · Kshitij Tayal · Yulong Pei · Hao Xu · Lei Chen · Pengyu Cheng · Ines Chami · Dongkwan Kim · Guilherme Gomes · Lukasz Maziarka · Jessica Hoffmann · Ron Levie · Antonia Gogoglou · Shunwang Gong · Federico Monti · Wenlin Wang · Yan Leng · Salvatore Vivona · Daniel Flam-Shepherd · Chester Holtz · Li Zhang · MAHMOUD KHADEMI · I-Chung Hsieh · Aleksandar Stanić · Ziqiao Meng · Yuhang Jiao -
2019 : Opening remarks »
Will Hamilton -
2019 Workshop: Graph Representation Learning »
Will Hamilton · Rianne van den Berg · Michael Bronstein · Stefanie Jegelka · Thomas Kipf · Jure Leskovec · Renjie Liao · Yizhou Sun · Petar Veličković -
2019 Poster: vGraph: A Generative Model for Joint Community Detection and Node Representation Learning »
Fan-Yun Sun · Meng Qu · Jordan Hoffmann · Chin-Wei Huang · Jian Tang -
2019 Poster: Unsupervised State Representation Learning in Atari »
Ankesh Anand · Evan Racah · Sherjil Ozair · Yoshua Bengio · Marc-Alexandre Côté · R Devon Hjelm -
2019 Poster: Probabilistic Logic Neural Networks for Reasoning »
Meng Qu · Jian Tang -
2019 Poster: Metalearned Neural Memory »
Tsendsuren Munkhdalai · Alessandro Sordoni · TONG WANG · Adam Trischler -
2019 Poster: Efficient Graph Generation with Graph Recurrent Attention Networks »
Renjie Liao · Yujia Li · Yang Song · Shenlong Wang · Will Hamilton · David Duvenaud · Raquel Urtasun · Richard Zemel -
2018 : Introducing "First TextWorld Problems": a text-based game competition »
Marc-Alexandre Côté -
2018 : Opening Remarks »
Adam Trischler -
2018 Workshop: Reinforcement Learning under Partial Observability »
Joni Pajarinen · Chris Amato · Pascal Poupart · David Hsu -
2018 Workshop: Wordplay: Reinforcement and Language Learning in Text-based Games »
Adam Trischler · Angeliki Lazaridou · Yonatan Bisk · Wendy Tay · Nate Kushman · Marc-Alexandre Côté · Alessandro Sordoni · Daniel Ricks · Tom Zahavy · Hal Daumé III -
2018 Poster: Deep Homogeneous Mixture Models: Representation, Separation, and Approximation »
Priyank Jaini · Pascal Poupart · Yaoliang Yu -
2018 Poster: Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks »
Agastya Kalra · Abdullah Rashwan · Wei-Shou Hsu · Pascal Poupart · Prashant Doshi · George Trimponias -
2018 Poster: Unsupervised Video Object Segmentation for Deep Reinforcement Learning »
Vikash Goel · Jameson Weng · Pascal Poupart -
2018 Poster: Monte-Carlo Tree Search for Constrained POMDPs »
Jongmin Lee · Geon-Hyeong Kim · Pascal Poupart · Kee-Eung Kim -
2018 Poster: Towards Text Generation with Adversarially Learned Neural Outlines »
Sandeep Subramanian · Sai Rajeswar Mudumba · Alessandro Sordoni · Adam Trischler · Aaron Courville · Chris Pal -
2018 Demonstration: TextWorld: A Learning Environment for Text-based Games »
Marc-Alexandre Côté · Wendy Tay · Xingdi Yuan -
2017 Poster: Hybrid Reward Architecture for Reinforcement Learning »
Harm Van Seijen · Mehdi Fatemi · Romain Laroche · Joshua Romoff · Tavian Barnes · Jeffrey Tsang -
2017 Poster: Plan, Attend, Generate: Planning for Sequence-to-Sequence Models »
Caglar Gulcehre · Francis Dutil · Adam Trischler · Yoshua Bengio -
2017 Poster: Z-Forcing: Training Stochastic Recurrent Networks »
Anirudh Goyal · Alessandro Sordoni · Marc-Alexandre Côté · Nan Rosemary Ke · Yoshua Bengio -
2016 Poster: Online Bayesian Moment Matching for Topic Modeling with Unknown Number of Topics »
Wei-Shou Hsu · Pascal Poupart -
2016 Poster: A Unified Approach for Learning the Parameters of Sum-Product Networks »
Han Zhao · Pascal Poupart · Geoffrey Gordon