From the earliest years of our lives, humans use language to express our beliefs and desires. Being able to talk to artificial agents about our preferences would thus fulfill a central goal of value alignment. Yet today, we lack computational models explaining such language use. To address this challenge, we formalize learning from language in a contextual bandit setting and ask how a human might communicate preferences over behaviors. We study two distinct types of language: instructions, which provide information about the desired policy, and descriptions, which provide information about the reward function. We show that the agent's degree of autonomy determines which form of language is optimal: instructions are better in low-autonomy settings, but descriptions are better when the agent will need to act independently. We then define a pragmatic listener agent that robustly infers the speaker's reward function by reasoning about how the speaker expresses themselves. We validate our models with a behavioral experiment, demonstrating that (1) our speaker model predicts human behavior, and (2) our pragmatic listener successfully recovers humans' reward functions. Finally, we show that this form of social learning can integrate with and reduce regret in traditional reinforcement learning. We hope these insights facilitate a shift from developing agents that obey language to agents that learn from it.
Author Information
Theodore Sumers (Princeton University)
My research uses reinforcement learning and decision theory to study human communication. Theoretically, I'm interested in explaining how societies accumulate information over generations. Practically, I hope to develop artificial systems capable of interacting with and learning from humans.
Robert Hawkins (Princeton University)
Mark Ho (New York University)
Tom Griffiths (Princeton University)
Dylan Hadfield-Menell (MIT)
More from the Same Authors
-
2021 : Meta-learning inductive biases of learning systems with Gaussian processes »
Michael Li · Erin Grant · Tom Griffiths -
2022 : Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement »
Michael Chang · Alyssa L Dayan · Franziska Meier · Tom Griffiths · Sergey Levine · Amy Zhang -
2022 : How to talk so AI will learn: instructions, descriptions, and pragmatics »
Theodore Sumers · Robert Hawkins · Mark Ho · Tom Griffiths · Dylan Hadfield-Menell -
2022 : On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning »
Dilip Arumugam · Mark Ho · Noah Goodman · Benjamin Van Roy -
2022 : Fast Adaptation via Human Diagnosis of Task Distribution Shift »
Andi Peng · Mark Ho · Aviv Netanyahu · Julie A Shah · Pulkit Agrawal -
2022 : Diagnostics for Deep Neural Networks with Automated Copy/Paste Attacks »
Stephen Casper · Kaivalya Hariharan · Dylan Hadfield-Menell -
2022 : On the informativeness of supervision signals »
Ilia Sucholutsky · Raja Marjieh · Tom Griffiths -
2022 Workshop: Shared Visual Representations in Human and Machine Intelligence (SVRHM) »
Arturo Deza · Joshua Peterson · N Apurva Ratan Murty · Tom Griffiths -
2022 Poster: Using natural language and program abstractions to instill human inductive biases in machines »
Sreejan Kumar · Carlos G. Correa · Ishita Dasgupta · Raja Marjieh · Michael Y Hu · Robert Hawkins · Jonathan D Cohen · Nathaniel Daw · Karthik Narasimhan · Tom Griffiths -
2022 Poster: Robust Feature-Level Adversaries are Interpretability Tools »
Stephen Casper · Max Nadeau · Dylan Hadfield-Menell · Gabriel Kreiman -
2022 Poster: Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation »
Michael Chang · Tom Griffiths · Sergey Levine -
2021 : Reinforcement learning: It's all in the mind »
Tom Griffiths -
2021 Workshop: Workshop on Human and Machine Decisions »
Daniel Reichman · Joshua Peterson · Kiran Tomlinson · Annie Liang · Tom Griffiths -
2021 : Opening remarks »
Tom Griffiths -
2021 : The Right Words for the Job: Coordinating on Task-Relevant Conventions via Bayesian Program Learning »
Robert Hawkins -
2021 : Exploring the Structure of Human Adjective Representations »
Karan Grewal · Joshua Peterson · Bill Thompson · Tom Griffiths -
2021 : Invited Talk 4 »
Tom Griffiths -
2021 Workshop: Shared Visual Representations in Human and Machine Intelligence »
Arturo Deza · Joshua Peterson · N Apurva Ratan Murty · Tom Griffiths -
2021 Oral: Passive attention in artificial neural networks predicts human visual selectivity »
Thomas Langlois · Haicheng Zhao · Erin Grant · Ishita Dasgupta · Tom Griffiths · Nori Jacoby -
2021 Poster: Passive attention in artificial neural networks predicts human visual selectivity »
Thomas Langlois · Haicheng Zhao · Erin Grant · Ishita Dasgupta · Tom Griffiths · Nori Jacoby -
2020 Workshop: Shared Visual Representations in Human and Machine Intelligence (SVRHM) »
Arturo Deza · Joshua Peterson · N Apurva Ratan Murty · Tom Griffiths -
2019 : Concluding Remarks & Prizes Ceremony »
Arturo Deza · Joshua Peterson · Apurva Ratan Murty · Tom Griffiths -
2019 : Tom Griffiths »
Tom Griffiths -
2019 : Opening Remarks »
Arturo Deza · Joshua Peterson · Apurva Ratan Murty · Tom Griffiths -
2019 Workshop: Shared Visual Representations in Human and Machine Intelligence »
Arturo Deza · Joshua Peterson · Apurva Ratan Murty · Tom Griffiths -
2019 Poster: Reconciling meta-learning and continual learning with online mixtures of tasks »
Ghassen Jerfel · Erin Grant · Tom Griffiths · Katherine Heller -
2019 Spotlight: Reconciling meta-learning and continual learning with online mixtures of tasks »
Ghassen Jerfel · Erin Grant · Tom Griffiths · Katherine Heller -
2019 Poster: On the Utility of Learning about Humans for Human-AI Coordination »
Micah Carroll · Rohin Shah · Mark Ho · Tom Griffiths · Sanjit Seshia · Pieter Abbeel · Anca Dragan