Exploration in reinforcement learning through intrinsic rewards has previously been addressed by approaches based on state novelty or artificial curiosity. In partially observable settings where observations look alike, state novelty can cause the intrinsic reward to vanish prematurely. On the other hand, curiosity-based approaches require modeling precise environment dynamics, which may be quite complex. Here we propose random curiosity with general value functions (RC-GVF), an intrinsic reward function that connects state novelty and artificial curiosity. Instead of predicting the entire environment dynamics, RC-GVF predicts temporally extended values through general value functions (GVFs) and uses the prediction error as an intrinsic reward. In this way, our approach generalises a popular approach called random network distillation (RND) by encouraging behavioural diversity, and reduces the need for additional maximum entropy regularisation. Our experiments on four procedurally generated partially observable environments indicate that our approach is competitive with RND and could be beneficial in environments that require behavioural exploration.
Author Information
Aditya Ramesh (IDSIA)
Louis Kirsch (The Swiss AI Lab IDSIA)
Sjoerd van Steenkiste (Google Research)
Jürgen Schmidhuber (Swiss AI Lab, IDSIA (USI & SUPSI); NNAISENSE; KAUST)
Since age 15 or so, the main goal of professor Jürgen Schmidhuber has been to build a self-improving Artificial Intelligence (AI) smarter than himself, then retire. His lab's Deep Learning Neural Networks, based on ideas published in the "Annus Mirabilis" 1990-1991, have revolutionised machine learning and AI. By the mid 2010s, they were on 3 billion devices and used billions of times per day by users of the world's most valuable public companies, e.g., for greatly improved (CTC-LSTM-based) speech recognition on all Android phones, greatly improved machine translation through Google Translate and Facebook (over 4 billion LSTM-based translations per day), Apple's Siri and QuickType on all iPhones, the answers of Amazon's Alexa, and numerous other applications. In 2011, his team was the first to win official computer vision contests through deep neural nets, with superhuman performance. In 2012, they had the first deep NN to win a medical imaging contest (on cancer detection). All of this attracted enormous interest from industry. His research group also established the fields of mathematically rigorous universal AI and recursive self-improvement in metalearning machines that learn to learn (since 1987). In 1990, he introduced unsupervised adversarial neural networks that fight each other in a minimax game to achieve artificial curiosity (GANs are a special case). In 1991, he introduced very deep learning through unsupervised pre-training, and neural fast weight programmers formally equivalent to what's now called linear Transformers. His formal theory of creativity & curiosity & fun explains art, science, music, and humor. He also generalized algorithmic information theory and the many-worlds theory of physics, and introduced the concept of Low-Complexity Art, the information age's extreme form of minimal art.
He is the recipient of numerous awards, the author of over 350 peer-reviewed papers, and Chief Scientist of the company NNAISENSE, which aims at building the first practical general-purpose AI. He is a frequent keynote speaker and advises various governments on AI strategies.
More from the Same Authors
-
2020 : Meta-Learning Backpropagation And Improving It »
Louis Kirsch -
2021 : Learning Adaptive Control Flow in Transformers for Improved Systematic Generalization »
Róbert Csordás · Kazuki Irie · Jürgen Schmidhuber -
2021 : Augmenting Classic Algorithms with Neural Components for Strong Generalisation on Ambiguous and High-Dimensional Data »
Imanol Schlag · Jürgen Schmidhuber -
2021 : Improving Baselines in the Wild »
Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber -
2021 : A Modern Self-Referential Weight Matrix That Learns to Modify Itself »
Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber -
2021 : Unsupervised Learning of Temporal Abstractions using Slot-based Transformers »
Anand Gopalakrishnan · Kazuki Irie · Jürgen Schmidhuber · Sjoerd van Steenkiste -
2021 : Introducing Symmetries to Black Box Meta Reinforcement Learning »
Louis Kirsch · Sebastian Flennerhag · Hado van Hasselt · Abram Friesen · Junhyuk Oh · Yutian Chen -
2022 : Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks »
Kazuki Irie · Jürgen Schmidhuber -
2022 : Meta-Learning General-Purpose Learning Algorithms with Transformers »
Louis Kirsch · Luke Metz · James Harrison · Jascha Sohl-Dickstein -
2022 : On Narrative Information and the Distillation of Stories »
Dylan Ashley · Vincent Herrmann · Zachary Friggstad · Jürgen Schmidhuber -
2022 : The Benefits of Model-Based Generalization in Reinforcement Learning »
Kenny Young · Aditya Ramesh · Louis Kirsch · Jürgen Schmidhuber -
2022 : Learning gaze control, external attention, and internal attention since 1990-91 »
Jürgen Schmidhuber -
2022 Poster: Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules »
Kazuki Irie · Francesco Faccio · Jürgen Schmidhuber -
2022 Poster: Exploring through Random Curiosity with General Value Functions »
Aditya Ramesh · Louis Kirsch · Sjoerd van Steenkiste · Jürgen Schmidhuber -
2021 : Panel Discussion 1 »
Megan Peters · Jürgen Schmidhuber · Simona Ghetti · Nick Roy · Oiwi Parker Jones · Ingmar Posner -
2021 : Credit Assignment & Meta-Learning in a Single Lifelong Trial »
Jürgen Schmidhuber -
2021 Poster: Going Beyond Linear Transformers with Recurrent Fast Weight Programmers »
Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber -
2021 Poster: Meta Learning Backpropagation And Improving It »
Louis Kirsch · Jürgen Schmidhuber -
2020 : Q/A for invited talk #4 »
Louis Kirsch -
2020 : General meta-learning »
Louis Kirsch -
2020 Workshop: Object Representations for Learning and Reasoning »
William Agnew · Rim Assouel · Michael Chang · Antonia Creswell · Eliza Kosoy · Aravind Rajeswaran · Sjoerd van Steenkiste -
2019 : Panel Discussion »
Jacob Andreas · Edward Gibson · Stefan Lee · Noga Zaslavsky · Jason Eisner · Jürgen Schmidhuber -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · · mariana vargas vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Poster: Are Disentangled Representations Helpful for Abstract Visual Reasoning? »
Sjoerd van Steenkiste · Francesco Locatello · Jürgen Schmidhuber · Olivier Bachem -
2018 : Invited Speaker #4 Juergen Schmidhuber »
Jürgen Schmidhuber -
2018 Poster: Modular Networks: Learning to Decompose Neural Computation »
Louis Kirsch · Julius Kunze · David Barber -
2018 Poster: Recurrent World Models Facilitate Policy Evolution »
David Ha · Jürgen Schmidhuber -
2018 Oral: Recurrent World Models Facilitate Policy Evolution »
David Ha · Jürgen Schmidhuber -
2018 Poster: Learning to Reason with Third Order Tensor Products »
Imanol Schlag · Jürgen Schmidhuber -
2017 : Morning panel discussion »
Jürgen Schmidhuber · Noah Goodman · Anca Dragan · Pushmeet Kohli · Dhruv Batra -
2017 : HRL with gradient-based subgoal generators, asymptotically optimal incremental problem solvers, various meta-learners, and PowerPlay (Jürgen Schmidhuber) »
Jürgen Schmidhuber -
2017 : Relational neural expectation maximization »
Sjoerd van Steenkiste -
2017 : Invited Talk »
Jürgen Schmidhuber -
2017 Poster: Neural Expectation Maximization »
Klaus Greff · Sjoerd van Steenkiste · Jürgen Schmidhuber -
2016 : Juergen Schmidhuber (Scientific Director of the Swiss AI Lab IDSIA) »
Jürgen Schmidhuber -
2016 Symposium: Recurrent Neural Networks and Other Machines that Learn Algorithms »
Jürgen Schmidhuber · Sepp Hochreiter · Alex Graves · Rupesh K Srivastava -
2016 Poster: Tagger: Deep Unsupervised Perceptual Grouping »
Klaus Greff · Antti Rasmus · Mathias Berglund · Hotloo Xiranood · Harri Valpola · Jürgen Schmidhuber -
2015 : Deep Learning RNNaissance »
Jürgen Schmidhuber -
2015 : On General Problem Solving and How to Learn an Algorithm »
Jürgen Schmidhuber -
2015 Poster: Training Very Deep Networks »
Rupesh K Srivastava · Klaus Greff · Jürgen Schmidhuber -
2015 Spotlight: Training Very Deep Networks »
Rupesh K Srivastava · Klaus Greff · Jürgen Schmidhuber -
2015 Poster: Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation »
Marijn F Stollenga · Wonmin Byeon · Marcus Liwicki · Jürgen Schmidhuber -
2014 Poster: Deep Networks with Internal Selective Attention through Feedback Connections »
Marijn F Stollenga · Jonathan Masci · Faustino Gomez · Jürgen Schmidhuber -
2013 Poster: Compete to Compute »
Rupesh K Srivastava · Jonathan Masci · Sohrob Kazerounian · Faustino Gomez · Jürgen Schmidhuber -
2012 Poster: Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images »
Dan Ciresan · Alessandro Giusti · Luca Maria Gambardella · Jürgen Schmidhuber -
2010 Poster: Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices »
Yi Sun · Faustino Gomez · Jürgen Schmidhuber -
2008 Poster: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks »
Alex Graves · Jürgen Schmidhuber -
2008 Spotlight: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks »
Alex Graves · Jürgen Schmidhuber -
2007 Poster: Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks »
Alex Graves · Santiago Fernandez · Marcus Liwicki · Horst Bunke · Jürgen Schmidhuber