Abstract object properties and their relations are deeply rooted in human common sense, allowing people to predict the dynamics of the world even in situations that are novel but governed by familiar laws of physics. Standard machine learning models used in model-based reinforcement learning fail to generalize in this way. Inspired by the classic framework of noisy indeterministic deictic (NID) rules, we introduce Neural NID, a method that learns abstract object properties and relations between objects with a suitably regularized graph neural network. We validate the greater generalization capability of Neural NID on simple benchmarks specifically designed to assess the transition dynamics learned by the model.
Author Information
Luca Viano (EPFL)
Johanni Brea (Swiss Federal Institute of Technology Lausanne)
More from the Same Authors
- 2023 Poster: Alternation makes the adversary weaker in two-player games
  Volkan Cevher · Ashok Cutkosky · Ali Kavis · Georgios Piliouras · Stratis Skoulakis · Luca Viano
- 2023 Poster: Should Under-parameterized Student Networks Copy or Average Teacher Weights?
  Berfin Simsek · Amire Bendjeddou · Wulfram Gerstner · Johanni Brea
- 2022 Poster: Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning
  Paul Rolland · Luca Viano · Norman Schürhoff · Boris Nikolov · Volkan Cevher
- 2022 Poster: Proximal Point Imitation Learning
  Luca Viano · Angeliki Kamoutsi · Gergely Neu · Igor Krawczuk · Volkan Cevher
- 2022 Poster: Understanding Deep Neural Function Approximation in Reinforcement Learning via $\epsilon$-Greedy Exploration
  Fanghui Liu · Luca Viano · Volkan Cevher
- 2022 Poster: Kernel Memory Networks: A Unifying Framework for Memory Modeling
  Georgios Iatropoulos · Johanni Brea · Wulfram Gerstner
- 2021 Poster: Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch
  Luca Viano · Yu-Ting Huang · Parameswaran Kamalaruban · Adrian Weller · Volkan Cevher
- 2021 Poster: Fitting summary statistics of neural data with a differentiable spiking network simulator
  Guillaume Bellec · Shuqi Wang · Alireza Modirshanechi · Johanni Brea · Wulfram Gerstner