Transformers are Sample-Efficient World Models
Vincent Micheli · Eloi Alonso · François Fleuret

Fri Dec 09 11:30 AM -- 11:45 AM (PST)
Event URL: https://openreview.net/forum?id=WIimAcYcZ5U

Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems. Recently, many model-based methods have been designed to address this issue, with learning in the imagination of a world model being one of the most prominent approaches. However, while virtually unlimited interaction with a simulated environment sounds appealing, the world model has to be accurate over extended periods of time. Motivated by the success of Transformers in sequence modeling tasks, we introduce IRIS, a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer. With the equivalent of only two hours of gameplay in the Atari 100k benchmark, IRIS achieves a mean human normalized score of 1.046, and outperforms humans on 10 out of 26 games, setting a new state of the art for methods without lookahead search. To foster future research on Transformers and world models for sample-efficient reinforcement learning, we release our codebase at this https URL. For the review process, we provide the code and visualizations in the supplementary materials.
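As a rough illustration of the two figures quoted in the abstract, the sketch below shows how a human-normalized score is commonly computed and why 100k environment steps correspond to about two hours of play. The frame skip of 4 and the 60 frames per second are assumptions about the usual Atari 100k setup, not details stated in the abstract, and the game names and scores are made-up placeholders rather than results from the paper.

```python
def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Common definition: 0.0 matches a random policy, 1.0 matches the human reference."""
    return (agent - random) / (human - random)


def summarize(scores: dict[str, tuple[float, float, float]]) -> tuple[float, int]:
    """Return (mean human-normalized score, number of games above human level).

    `scores` maps a game name to (agent, random, human) raw scores.
    """
    hns = {game: human_normalized_score(*raw) for game, raw in scores.items()}
    mean = sum(hns.values()) / len(hns)
    superhuman = sum(1 for value in hns.values() if value > 1.0)
    return mean, superhuman


def gameplay_hours(agent_steps: int = 100_000, frame_skip: int = 4, fps: int = 60) -> float:
    """Convert agent steps to hours of real-time play (frame_skip and fps are assumed)."""
    return agent_steps * frame_skip / fps / 3600


if __name__ == "__main__":
    # Hypothetical placeholder scores: (agent, random, human).
    example = {
        "game_a": (150.0, 0.0, 100.0),
        "game_b": (25.0, 5.0, 45.0),
    }
    mean, superhuman = summarize(example)
    print(f"mean HNS = {mean:.3f}, games above human = {superhuman}")
    print(f"100k steps ~ {gameplay_hours():.2f} hours of gameplay")
```

Under this definition, a mean score above 1.0 can coexist with most games being below human level, since a few strongly superhuman games can pull the average up.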

Author Information

Vincent Micheli (University of Geneva, Switzerland)
Eloi Alonso (University of Geneva)
François Fleuret (University of Geneva)

François Fleuret received a PhD in Mathematics from INRIA and the University of Paris VI in 2000, and a Habilitation degree in Mathematics from the University of Paris XIII in 2006. He is a Full Professor in the Department of Computer Science at the University of Geneva and an Adjunct Professor in the School of Engineering of the École Polytechnique Fédérale de Lausanne. He has published more than 80 papers in peer-reviewed international conferences and journals. He is an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence, serves as Area Chair for NeurIPS, AAAI, and ICCV, and sits on the program committees of many top-tier international conferences in machine learning and computer vision. He has served, or currently serves, as an expert for multiple funding agencies. He is the inventor of several patents in the field of machine learning and a co-founder of Neural Concept SA, a company specializing in the development and commercialization of deep learning solutions for engineering design. His main research interest is machine learning, with a particular focus on computational aspects and sample efficiency.

More from the Same Authors