Invited Talk
Workshop: InterNLP: Workshop on Interactive Learning for Natural Language Processing

Anca Dragan: Learning human preferences from language

Anca Dragan


In classic instruction following, language like "I'd like the JetBlue flight" maps to actions (e.g., selecting that flight). However, language also conveys information about a user's underlying reward function (e.g., a general preference for JetBlue), which can allow a model to carry out desirable actions in new contexts. In this talk, I'll share a model that infers rewards from language pragmatically: reasoning about how speakers choose utterances not only to elicit desired actions, but also to reveal information about their preferences.
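To make the idea of inferring rewards from language concrete, here is a minimal toy sketch in Python. It is not the model presented in the talk. It assumes a simplified setup in which a speaker names one feature, a literal listener picks uniformly among the matching flights, and a pragmatic listener applies Bayes' rule over a handful of candidate reward functions. The flights, utterances, reward hypotheses, and the rationality parameter `beta` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy context: each flight is described by a set of features.
FLIGHTS = {
    "A": {"jetblue", "expensive"},
    "B": {"delta", "cheap"},
    "C": {"jetblue", "cheap"},
}

# Hypothetical utterances: the speaker names a single feature.
UTTERANCES = ["jetblue", "delta", "cheap", "expensive"]

# Hypothetical reward hypotheses: one weight per feature.
REWARDS = {
    "likes_jetblue": {"jetblue": 1.0, "delta": 0.0, "cheap": 0.0, "expensive": 0.0},
    "likes_delta": {"jetblue": 0.0, "delta": 1.0, "cheap": 0.0, "expensive": 0.0},
    "likes_cheap": {"jetblue": 0.0, "delta": 0.0, "cheap": 1.0, "expensive": 0.0},
}


def flight_reward(flight, weights):
    """Reward of a flight: the sum of the weights of its features."""
    return sum(weights[feature] for feature in FLIGHTS[flight])


def literal_listener(utterance):
    """Choose uniformly among the flights that have the named feature."""
    matches = [f for f in FLIGHTS if utterance in FLIGHTS[f]]
    return {f: 1.0 / len(matches) for f in matches}


def speaker(weights, beta=3.0):
    """Softmax over utterances by the expected reward of the listener's choice."""
    scores = {}
    for u in UTTERANCES:
        expected = sum(
            p * flight_reward(f, weights) for f, p in literal_listener(u).items()
        )
        scores[u] = math.exp(beta * expected)
    total = sum(scores.values())
    return {u: s / total for u, s in scores.items()}


def infer_reward(utterance, beta=3.0):
    """Posterior over reward hypotheses given an utterance (uniform prior)."""
    prior = 1.0 / len(REWARDS)
    unnorm = {
        name: prior * speaker(weights, beta)[utterance]
        for name, weights in REWARDS.items()
    }
    total = sum(unnorm.values())
    return {name: v / total for name, v in unnorm.items()}


posterior = infer_reward("jetblue")
print(max(posterior, key=posterior.get))
```

In this toy, hearing "jetblue" shifts the posterior toward the hypothesis that the user generally prefers JetBlue, and that posterior could then be used to rank flights in a new context where the original utterance no longer applies. The speaker here is purely action-oriented; modeling a speaker who also chooses utterances to reveal preferences, as the abstract describes, would require an additional term that this sketch omits.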
