

Tutorial

Reinforcement Learning for Embodied Cognition

Dana Ballard


Abstract:

The enormous progress in instrumentation for measuring brain states has made it possible to tackle the long-standing problem of an overall model of brain computation. The intrinsic complexity of the brain can tempt one to set aside its relationship with the body, but the field of Embodied Cognition stresses that understanding brain function at the system level requires addressing the role of the brain-body interface. While it is obvious that the brain receives all its input through the senses and directs its outputs through the motor system, it has only recently been appreciated that the body interface performs huge amounts of computation that the brain does not have to repeat, and thus affords the brain great simplifications in its representations. In effect, the brain's abstract states can explicitly or implicitly refer to coded representations of the world created by the body.

Even if the brain can communicate with the world through abstractions, the severe speed limitations of its neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be accessed rapidly. One way this could happen is if the brain used some kind of decomposition whereby behavioral primitives could be quickly accessed and combined. Such a factorization has huge synergies with embodied cognition models, which can exploit the natural filtering imposed by the body in directing behavior to select relevant primitives. These advantages can be explored in virtual environments replete with humanoid avatars, settings that allow experimental parameters to be manipulated systematically. Our test settings are everyday natural behaviors such as walking and driving in a small town, and making sandwiches and looking for lost items in an apartment.
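To make the idea of composing behavioral primitives concrete, here is a minimal sketch (an illustrative assumption, not the tutorial's actual model) in which each currently relevant module reports action values over a shared action set and the agent picks the action that maximizes their sum; the body's filtering is modeled as a binary mask gating which modules are active.

```python
import numpy as np

rng = np.random.default_rng(1)
n_modules, n_actions = 4, 3
Q = rng.normal(size=(n_modules, n_actions))   # per-module action values
relevant = np.array([1, 0, 1, 1], dtype=bool) # context (e.g., gaze, body state) gates modules

combined = Q[relevant].sum(axis=0)            # additive composition of active modules
action = int(np.argmax(combined))
print("combined Q:", np.round(combined, 2), "-> action", action)
```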

Our focus is the programming of individual behavioral primitives using reinforcement learning (RL). Three issues are central: eye fixation programming, credit assignment to individual behavioral modules, and learning the value of behaviors via inverse reinforcement learning.

Eye fixations. Fixations are the central information-gathering method used by humans, yet the protocols for programming them remain unsettled. We show that information gain in an RL setting can potentially explain experimental data.
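As a concrete illustration of fixation selection by information gain, the following sketch (a hypothetical setup, not the experiments' actual model) maintains a Bernoulli belief about a task-relevant state at each candidate fixation target, and fixates the target whose noisy observation yields the largest expected reduction in entropy.

```python
import numpy as np

def bernoulli_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def expected_info_gain(p, acc=0.9):
    # p: prior P(state = 1) at each target; acc: probability the observation is correct.
    p_obs1 = acc * p + (1 - acc) * (1 - p)        # P(observe "present")
    post1 = acc * p / p_obs1                      # P(state=1 | observed present)
    post0 = (1 - acc) * p / (1 - p_obs1)          # P(state=1 | observed absent)
    expected_posterior_entropy = (p_obs1 * bernoulli_entropy(post1)
                                  + (1 - p_obs1) * bernoulli_entropy(post0))
    return bernoulli_entropy(p) - expected_posterior_entropy

beliefs = np.array([0.5, 0.9, 0.2, 0.65])  # one belief per candidate fixation target
gains = expected_info_gain(beliefs)
print("expected information gain:", np.round(gains, 3))
print("fixate target", int(np.argmax(gains)))  # the most uncertain target wins
```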

Credit assignment. If behaviors are decomposed into individual modules, then dividing the received reward among them becomes a major issue. We show that Bayesian estimation techniques, used in the RL setting, resolve this issue efficiently.
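The following is a minimal sketch of how Bayesian estimation can apportion a single scalar reward among modules, assuming (as a simplifying illustration, not necessarily the tutorial's formulation) that the global reward is the sum of unobserved per-module contributions, each tracked by a Gaussian belief; observing the global reward then yields a Kalman-style update that splits the surprise in proportion to each module's uncertainty.

```python
import numpy as np

def assign_credit(means, variances, global_reward, obs_noise=0.1):
    # Kalman update for observing the sum of independent Gaussian contributions.
    S = variances.sum() + obs_noise           # innovation variance
    gain = variances / S                      # per-module Kalman gain
    residual = global_reward - means.sum()    # reward not yet explained
    new_means = means + gain * residual
    new_vars = variances - gain * variances   # v_i - v_i^2 / S
    return new_means, new_vars

means = np.zeros(3)             # beliefs for 3 modules (e.g., avoid, approach, follow)
variances = np.ones(3)
for r in [1.2, 0.8, 1.1, 0.9]:  # stream of observed global rewards
    means, variances = assign_credit(means, variances, r)
print(np.round(means, 3), np.round(variances, 3))
```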

Inverse Reinforcement Learning. One way to learn new behaviors would be for a human agent to imitate them and learn their value. We show that an efficient algorithm developed by Rothkopf can estimate the value of behaviors from observed data using Bayesian RL techniques.
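In the same spirit, the sketch below shows a generic Bayesian IRL loop on a toy chain MDP: Metropolis-Hastings sampling over a per-state reward vector, scored by the likelihood that a softmax-optimal policy produced the demonstrations. The MDP, prior, and proposal here are illustrative assumptions, not Rothkopf's exact algorithm.

```python
import numpy as np

N, GAMMA, BETA = 5, 0.9, 5.0        # chain states, discount, softmax sharpness

def q_values(reward):
    # Value iteration on a deterministic chain; action 0 = left, 1 = right.
    nxt = np.stack([np.maximum(np.arange(N) - 1, 0),
                    np.minimum(np.arange(N) + 1, N - 1)], axis=1)
    V = np.zeros(N)
    for _ in range(200):
        Q = reward[:, None] + GAMMA * V[nxt]
        V = Q.max(axis=1)
    return Q

def log_posterior(reward, demos):
    logits = BETA * q_values(reward)
    m = logits.max(axis=1, keepdims=True)     # stable log-softmax
    logp = logits - (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True)))
    log_lik = sum(logp[s, a] for s, a in demos)
    log_prior = -0.5 * (reward ** 2).sum() / 10.0   # weak Gaussian prior
    return log_lik + log_prior

rng = np.random.default_rng(0)
demos = [(0, 1), (1, 1), (2, 1), (3, 1)]   # expert always moves right (goal: state 4)

reward = np.zeros(N)
lp = log_posterior(reward, demos)
samples = []
for _ in range(2000):                       # Metropolis-Hastings over reward vectors
    proposal = reward + 0.2 * rng.normal(size=N)
    lp_new = log_posterior(proposal, demos)
    if np.log(rng.random()) < lp_new - lp:  # symmetric proposal, so ratio suffices
        reward, lp = proposal, lp_new
    samples.append(reward.copy())
print("posterior mean reward:", np.round(np.mean(samples[500:], axis=0), 2))
```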
