This paper proposes a novel deep reinforcement learning (RL) architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network. In contrast to typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to make option-conditional predictions of future values (discounted sum of rewards) rather than of future observations. Our experimental results show that VPN has several advantages over both model-free and model-based baselines in a stochastic environment where careful planning is required but building an accurate observation-prediction model is difficult. Furthermore, VPN outperforms Deep Q-Network (DQN) on several Atari games even with short-lookahead planning, demonstrating its potential as a new way of learning a good state representation.
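The short-lookahead planning mentioned in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation: `ToyVPN`, its method names, the fixed random linear maps standing in for trained modules, the constant discount, and the rule that averages the direct value estimate with the best lookahead estimate are all assumptions made for illustration. It only shows the general idea that planning can run entirely over abstract states, rewards, and values, without ever predicting an observation.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


class ToyVPN:
    """Hypothetical stand-in for a learned network (random, untrained weights)."""

    def __init__(self, obs_dim, state_dim, num_options, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.num_options = num_options
        self.gamma = 0.9  # assumed constant discount
        self.enc = mat(state_dim, obs_dim)
        self.val = mat(1, state_dim)[0]
        self.rew = mat(num_options, state_dim)
        self.trans = [mat(state_dim, state_dim) for _ in range(num_options)]

    def encode(self, obs):
        """Observation -> abstract state."""
        return [math.tanh(dot(row, obs)) for row in self.enc]

    def value(self, state):
        """Abstract state -> estimated value."""
        return dot(self.val, state)

    def outcome(self, state, option):
        """Abstract state and option -> (predicted reward, discount)."""
        return dot(self.rew[option], state), self.gamma

    def transition(self, state, option):
        """Abstract state and option -> next abstract state."""
        return [math.tanh(dot(row, state)) for row in self.trans[option]]


def q_plan(model, state, option, depth):
    """Option value from a depth-limited rollout in abstract-state space."""
    reward, gamma = model.outcome(state, option)
    next_state = model.transition(state, option)
    return reward + gamma * v_plan(model, next_state, depth)


def v_plan(model, state, depth):
    """Mix the direct value estimate with the best lookahead estimate."""
    direct = model.value(state)
    if depth <= 1:
        return direct
    best = max(q_plan(model, state, o, depth - 1) for o in range(model.num_options))
    return direct / depth + (depth - 1) / depth * best


def best_option(model, obs, depth):
    """Pick the option with the highest planned value for an observation."""
    state = model.encode(obs)
    return max(range(model.num_options), key=lambda o: q_plan(model, state, o, depth))


if __name__ == "__main__":
    model = ToyVPN(obs_dim=4, state_dim=3, num_options=2, seed=0)
    print(best_option(model, [1.0, 0.0, -1.0, 0.5], depth=3))
```

With untrained weights the chosen option is meaningless; the point is only the shape of the computation, in which every quantity the planner touches is a value, a reward, a discount, or an abstract state.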
Author Information
Junhyuk Oh (DeepMind)
Satinder Singh (University of Michigan)
Honglak Lee (Google / U. Michigan)
More from the Same Authors
2021 : Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks »
Yijie Guo · Qiucheng Wu · Honglak Lee -
2021 : Fast Inference and Transfer of Compositional Task for Few-shot Task Generalization »
Sungryull Sohn · Hyunjae Woo · Jongwook Choi · Izzeddin Gur · Aleksandra Faust · Honglak Lee -
2021 : Learning Parameterized Task Structure for Generalization to Unseen Entities »
Anthony Liu · Sungryull Sohn · Honglak Lee -
2021 : SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning »
Jongjin Park · Younggyo Seo · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2021 : Learning compositional tasks from language instructions »
Lajanugen Logeswaran · Wilka Carvalho · Honglak Lee -
2022 : Allele-conditional attention mechanism for HLA-peptide complex binding affinity prediction »
Rodrigo Hormazabal · Doyeong Hwang · Kiyoung Kim · Sehui Han · Kyunghoon Bae · Honglak Lee -
2022 : Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization »
Changyeon Kim · Junsu Kim · Younggyo Seo · Kimin Lee · Honglak Lee · Jinwoo Shin -
2022 : Learning Exploration Policies with View-based Intrinsic Rewards »
Yijie Guo · Yao Fu · Run Peng · Honglak Lee -
2022 : ReSPack: A Large-Scale Rectilinear Steiner Tree Packing Data Generator and Benchmark »
Kanghoon Lee · Youngjoon Park · Han-Seul Jeong · Deunsol Yoon · Sunghoon Hong · Sungryull Sohn · Minu Kim · Hanbum Ko · Moontae Lee · Honglak Lee · Kyunghoon Kim · Euihyuk Kim · Seonggeon Cho · Jaesang Min · Woohyung Lim -
2022 Poster: Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching »
Byoungjip Kim · Sungik Choi · Dasol Hwang · Moontae Lee · Honglak Lee -
2022 Poster: Pure Transformers are Powerful Graph Learners »
Jinwoo Kim · Dat Nguyen · Seonwoo Min · Sungjun Cho · Moontae Lee · Honglak Lee · Seunghoon Hong -
2022 Poster: OpenSRH: optimizing brain tumor surgery using intraoperative stimulated Raman histology »
Cheng Jiang · Asadur Chowdury · Xinhai Hou · Akhil Kondepudi · Christian Freudiger · Kyle Conway · Sandra Camelo-Piragua · Daniel Orringer · Honglak Lee · Todd Hollon -
2022 Poster: Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost »
Sungjun Cho · Seonwoo Min · Jinwoo Kim · Moontae Lee · Honglak Lee · Seunghoon Hong -
2022 Poster: UniCLIP: Unified Framework for Contrastive Language-Image Pre-training »
Janghyeon Lee · Jongsuk Kim · Hyounguk Shon · Bumsoo Kim · Seung Hwan Kim · Honglak Lee · Junmo Kim -
2022 Poster: CEDe: A collection of expert-curated datasets with atom-level entity annotations for Optical Chemical Structure Recognition »
Rodrigo Hormazabal · Changyoung Park · Soonyoung Lee · Sehui Han · Yeonsik Jo · Jaewan Lee · Ahra Jo · Seung Hwan Kim · Jaegul Choo · Moontae Lee · Honglak Lee -
2022 Expo Talk Panel: Towards learning agents for solving complex real-world tasks »
Honglak Lee -
2021 Poster: Why Do Better Loss Functions Lead to Less Transferable Features? »
Simon Kornblith · Ting Chen · Honglak Lee · Mohammad Norouzi -
2021 Poster: Improving Transferability of Representations via Augmentation-Aware Self-Supervision »
Hankook Lee · Kibok Lee · Kimin Lee · Honglak Lee · Jinwoo Shin -
2021 Poster: Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning »
Christopher Hoang · Sungryull Sohn · Jongwook Choi · Wilka Carvalho · Honglak Lee -
2021 Poster: Environment Generation for Zero-Shot Compositional Reinforcement Learning »
Izzeddin Gur · Natasha Jaques · Yingjie Miao · Jongwook Choi · Manoj Tiwari · Honglak Lee · Aleksandra Faust -
2020 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · Joelle Pineau · David Silver · Satinder Singh · Coline Devin · Misha Laskin · Kimin Lee · Janarthanan Rajendran · Vivek Veeriah -
2020 Poster: Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards »
Yijie Guo · Jongwook Choi · Marcin Moczulski · Shengyu Feng · Samy Bengio · Mohammad Norouzi · Honglak Lee -
2020 Poster: Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning »
Guangxiang Zhu · Minghao Zhang · Honglak Lee · Chongjie Zhang -
2019 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · Joelle Pineau · David Silver · Satinder Singh · Joshua Achiam · Carlos Florensa · Christopher Grimm · Haoran Tang · Vivek Veeriah -
2019 Poster: Discovery of Useful Questions as Auxiliary Tasks »
Vivek Veeriah · Matteo Hessel · Zhongwen Xu · Janarthanan Rajendran · Richard L Lewis · Junhyuk Oh · Hado van Hasselt · David Silver · Satinder Singh -
2019 Poster: No-Press Diplomacy: Modeling Multi-Agent Gameplay »
Philip Paquette · Yuchen Lu · SETON STEVEN BOCCO · Max Smith · Satya O.-G. · Jonathan K. Kummerfeld · Joelle Pineau · Satinder Singh · Aaron Courville -
2019 Poster: High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks »
Ruben Villegas · Arkanath Pathak · Harini Kannan · Dumitru Erhan · Quoc V Le · Honglak Lee -
2018 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · David Silver · Satinder Singh · Joelle Pineau · Joshua Achiam · Rein Houthooft · Aravind Srinivas -
2018 Poster: A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks »
Kimin Lee · Kibok Lee · Honglak Lee · Jinwoo Shin -
2018 Spotlight: A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks »
Kimin Lee · Kibok Lee · Honglak Lee · Jinwoo Shin -
2018 Poster: Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies »
Sungryull Sohn · Junhyuk Oh · Honglak Lee -
2018 Poster: On Learning Intrinsic Rewards for Policy Gradient Methods »
Zeyu Zheng · Junhyuk Oh · Satinder Singh -
2018 Poster: Learning Hierarchical Semantic Image Manipulation through Structured Representations »
Seunghoon Hong · Xinchen Yan · Thomas Huang · Honglak Lee -
2018 Poster: Completing State Representations using Spectral Learning »
Nan Jiang · Alex Kulesza · Satinder Singh -
2017 : Afternoon Panel discussion »
Brian Skyrms · Satinder Singh · Jacob Andreas -
2017 : "Language Emergence as Boundedly Optimal Control" »
Satinder Singh -
2017 : Invited Talk 5 »
Honglak Lee -
2017 : Minimax-Regret Querying on Side Effects in Factored Markov Decision Processes »
Satinder Singh -
2017 Workshop: Learning Disentangled Features: from Perception to Control »
Emily Denton · Siddharth Narayanaswamy · Tejas Kulkarni · Honglak Lee · Diane Bouchacourt · Josh Tenenbaum · David Pfau -
2017 : Invited Talk - Satindar Singh »
Satinder Singh -
2017 Symposium: Deep Reinforcement Learning »
Pieter Abbeel · Yan Duan · David Silver · Satinder Singh · Junhyuk Oh · Rein Houthooft -
2017 Poster: Repeated Inverse Reinforcement Learning »
Kareem Amin · Nan Jiang · Satinder Singh -
2017 Spotlight: Repeated Inverse Reinforcement Learning »
Kareem Amin · Nan Jiang · Satinder Singh -
2016 : Junhyuk Oh »
Junhyuk Oh -
2016 Workshop: Deep Reinforcement Learning »
David Silver · Satinder Singh · Pieter Abbeel · Peter Chen -
2016 Poster: Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision »
Xinchen Yan · Jimei Yang · Ersin Yumer · Yijie Guo · Honglak Lee -
2016 Poster: Learning What and Where to Draw »
Scott E Reed · Zeynep Akata · Santosh Mohan · Samuel Tenka · Bernt Schiele · Honglak Lee -
2016 Oral: Learning What and Where to Draw »
Scott E Reed · Zeynep Akata · Santosh Mohan · Samuel Tenka · Bernt Schiele · Honglak Lee -
2015 : Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning »
Honglak Lee -
2015 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · John Schulman · Satinder Singh · David Silver -
2015 Symposium: Deep Learning Symposium »
Yoshua Bengio · Marc'Aurelio Ranzato · Honglak Lee · Max Welling · Andrew Y Ng -
2015 Poster: Deep Visual Analogy-Making »
Scott E Reed · Yi Zhang · Yuting Zhang · Honglak Lee -
2015 Poster: Action-Conditional Video Prediction using Deep Networks in Atari Games »
Junhyuk Oh · Xiaoxiao Guo · Honglak Lee · Richard L Lewis · Satinder Singh -
2015 Spotlight: Action-Conditional Video Prediction using Deep Networks in Atari Games »
Junhyuk Oh · Xiaoxiao Guo · Honglak Lee · Richard L Lewis · Satinder Singh -
2015 Oral: Deep Visual Analogy-Making »
Scott E Reed · Yi Zhang · Yuting Zhang · Honglak Lee -
2015 Poster: Learning Structured Output Representation using Deep Conditional Generative Models »
Kihyuk Sohn · Honglak Lee · Xinchen Yan -
2015 Poster: Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis »
Jimei Yang · Scott E Reed · Ming-Hsuan Yang · Honglak Lee -
2014 Workshop: Representation and Learning Methods for Complex Outputs »
Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian YU · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem SAHBI · Fabio Massimo Zanzotto -
2014 Poster: Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning »
Xiaoxiao Guo · Satinder Singh · Honglak Lee · Richard L Lewis · Xiaoshi Wang -
2014 Poster: Improved Multimodal Deep Learning with Variation of Information »
Kihyuk Sohn · Wenling Shang · Honglak Lee -
2013 Poster: Reward Mapping for Transfer in Long-Lived Agents »
Xiaoxiao Guo · Satinder Singh · Richard L Lewis -
2013 Session: Oral Session 9 »
Satinder Singh -
2013 Poster: Robust Image Denoising with Multi-Column Deep Neural Networks »
Forest Agostinelli · Michael R Anderson · Honglak Lee -
2012 Poster: Learning to Align from Scratch »
Gary B Huang · Marwan A Mattar · Honglak Lee · Erik Learned-Miller -
2010 Workshop: Deep Learning and Unsupervised Feature Learning »
Honglak Lee · Marc'Aurelio Ranzato · Yoshua Bengio · Geoffrey E Hinton · Yann LeCun · Andrew Y Ng -
2010 Poster: Reward Design via Online Gradient Ascent »
Jonathan D Sorg · Satinder Singh · Richard L Lewis -
2009 Poster: Unsupervised feature learning for audio classification using convolutional deep belief networks »
Honglak Lee · Peter Pham · Yan Largman · Andrew Y Ng -
2008 Poster: Simple Local Models for Complex Dynamical Systems »
Erik Talvitie · Satinder Singh -
2008 Oral: Simple Local Models for Complex Dynamical Systems »
Erik Talvitie · Satinder Singh -
2007 Oral: Exponential Family Predictive Representations of State »
David Wingate · Satinder Singh -
2007 Poster: Exponential Family Predictive Representations of State »
David Wingate · Satinder Singh -
2007 Poster: Sparse deep belief net model for visual area V2 »
Honglak Lee · Ekanadham Chaitanya · Andrew Y Ng -
2006 Poster: Efficient sparse coding algorithms, end-stopping and nCRF surround suppression »
Honglak Lee · Alexis Battle · Raina Rajat · Andrew Y Ng