Many reinforcement learning (RL) environments consist of independent entities that interact sparsely. In such environments, RL agents have only limited influence over other entities in any particular situation. Our idea in this work is that learning can be efficiently guided by knowing when and what the agent can influence with its actions. To achieve this, we introduce a measure of situation-dependent causal influence based on conditional mutual information and show that it can reliably detect states of influence. We then propose several ways to integrate this measure into RL algorithms to improve exploration and off-policy learning. All modified algorithms show strong increases in data efficiency on robotic manipulation tasks.
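The influence measure described above is a conditional mutual information of the form C(s) = I(A; S'_j | S = s), scored per state. Below is a minimal sketch (not the authors' code) of how such a score could be estimated with a learned Gaussian transition model and Monte Carlo sampling over actions; the model class, its architecture, and all hyperparameters are illustrative assumptions.

```python
# Sketch: situation-dependent causal influence C(s) = I(A; S'_j | S = s),
# estimated by contrasting per-action predictions with the action-marginal
# mixture. Model interface and hyperparameters are hypothetical.
import torch
import torch.nn as nn


class GaussianTransitionModel(nn.Module):
    """Predicts p(s'_j | s, a) for one entity j as a diagonal Gaussian."""

    def __init__(self, state_dim, action_dim, entity_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * entity_dim),  # mean and log-std
        )

    def forward(self, state, action):
        mean, log_std = self.net(torch.cat([state, action], dim=-1)).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.clamp(-5, 2).exp())


@torch.no_grad()
def influence_score(model, state, action_dist, n_actions=64, n_samples=4):
    """Monte Carlo estimate of I(A; S'_j | S = s) at a single state s.

    Samples K actions, predicts p(s'_j | s, a_k) for each, and averages
    log p(s' | s, a_k) - log [(1/K) * sum_k' p(s' | s, a_k')] over samples.
    """
    actions = action_dist.sample((n_actions,))              # (K, action_dim)
    states = state.unsqueeze(0).expand(n_actions, -1)       # (K, state_dim)
    pred = model(states, actions)                           # K Gaussians
    samples = pred.sample((n_samples,))                     # (M, K, entity_dim)

    # Log-likelihood under the action that generated each sample.
    log_joint = pred.log_prob(samples).sum(-1)              # (M, K)

    # Log-likelihood under the mixture over all sampled actions.
    all_log_probs = pred.log_prob(samples.unsqueeze(2)).sum(-1)      # (M, K, K)
    log_marginal = torch.logsumexp(all_log_probs, dim=2) - torch.log(
        torch.tensor(float(n_actions)))

    return (log_joint - log_marginal).mean().item()


# Usage with random inputs, purely to illustrate the interface:
model = GaussianTransitionModel(state_dim=10, action_dim=4, entity_dim=3)
state = torch.zeros(10)
action_dist = torch.distributions.Normal(torch.zeros(4), torch.ones(4))
print(influence_score(model, state, action_dist))
```

A high score indicates that the agent's action choice noticeably changes the predicted distribution over the entity's next state, i.e. a state of influence; a score near zero indicates the entity currently evolves independently of the agent.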
Author Information
Maximilian Seitzer (Max Planck Institute for Intelligent Systems, Tübingen)
Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen)
Georg Martius (IST Austria)
More from the Same Authors
- 2021 Spotlight: Iterative Teaching by Label Synthesis
  Weiyang Liu · Zhen Liu · Hanchen Wang · Liam Paull · Bernhard Schölkopf · Adrian Weller
- 2021 Spotlight: DiBS: Differentiable Bayesian Structure Learning
  Lars Lorch · Jonas Rothfuss · Bernhard Schölkopf · Andreas Krause
- 2022: A Causal Framework to Quantify Robustness of Mathematical Reasoning with Language Models
  Alessandro Stolfo · Zhijing Jin · Kumar Shridhar · Bernhard Schölkopf · Mrinmaya Sachan
- 2021: Boxhead: A Dataset for Learning Hierarchical Representations
  Yukun Chen · Andrea Dittadi · Frederik Träuble · Stefan Bauer · Bernhard Schölkopf
- 2021 Poster: Dynamic Inference with Neural Interpreters
  Nasim Rahaman · Muhammad Waleed Gondal · Shruti Joshi · Peter Gehler · Yoshua Bengio · Francesco Locatello · Bernhard Schölkopf
- 2021 Poster: Hierarchical Reinforcement Learning with Timed Subgoals
  Nico Gürtler · Dieter Büchler · Georg Martius
- 2021 Poster: Planning from Pixels in Environments with Combinatorially Hard Search Spaces
  Marco Bagatella · Miroslav Olšák · Michal Rolínek · Georg Martius
- 2021 Poster: Independent mechanism analysis, a new concept?
  Luigi Gresele · Julius von Kügelgen · Vincent Stimper · Bernhard Schölkopf · Michel Besserve
- 2021 Poster: Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains
  Christian Gumbsch · Martin V. Butz · Georg Martius
- 2021 Poster: Iterative Teaching by Label Synthesis
  Weiyang Liu · Zhen Liu · Hanchen Wang · Liam Paull · Bernhard Schölkopf · Adrian Weller
- 2021 Poster: The Inductive Bias of Quantum Kernels
  Jonas Kübler · Simon Buchholz · Bernhard Schölkopf
- 2021 Poster: Backward-Compatible Prediction Updates: A Probabilistic Approach
  Frederik Träuble · Julius von Kügelgen · Matthäus Kleindessner · Francesco Locatello · Bernhard Schölkopf · Peter Gehler
- 2021 Poster: Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style
  Julius von Kügelgen · Yash Sharma · Luigi Gresele · Wieland Brendel · Bernhard Schölkopf · Michel Besserve · Francesco Locatello
- 2021 Poster: DiBS: Differentiable Bayesian Structure Learning
  Lars Lorch · Jonas Rothfuss · Bernhard Schölkopf · Andreas Krause
- 2021 Poster: Regret Bounds for Gaussian-Process Optimization in Large Domains
  Manuel Wuethrich · Bernhard Schölkopf · Andreas Krause
- 2019: Bernhard Schölkopf
  Bernhard Schölkopf
- 2018: Learning Independent Mechanisms
  Bernhard Schölkopf