Blending Autonomous Exploration and Apprenticeship Learning
Thomas Walsh · Daniel K Hewlett · Clayton T Morrison

Tue Dec 13 08:45 AM -- 02:59 PM (PST)

We present theoretical and empirical results for a framework that combines the benefits of apprenticeship and autonomous reinforcement learning. Our approach modifies an existing apprenticeship learning framework that relies on teacher demonstrations and does not necessarily explore the environment. The first change is replacing previously used Mistake Bound model learners with a recently proposed framework that melds the KWIK and Mistake Bound supervised learning protocols. The second change is introducing a communication of expected utility from the student to the teacher. The resulting system only uses teacher traces when the agent needs to learn concepts it cannot efficiently learn on its own.
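As a rough illustration of the two ingredients named in the abstract, the sketch below pairs a toy learner that can answer "I don't know" (the defining feature of the KWIK protocol) with a hypothetical rule for when a teacher would supply a trace. It is not the authors' algorithm: the class `MemorizingKWIKLearner`, the function `teacher_should_demonstrate`, the threshold `epsilon`, and the comparison rule itself are all assumptions made for this example.

```python
from typing import Dict, Hashable, Optional


class MemorizingKWIKLearner:
    """Toy KWIK-style learner for a deterministic function on a finite input set.

    It answers only for inputs it has already observed; for anything else it
    returns None, which stands in for the "I don't know" response.
    """

    def __init__(self) -> None:
        self._seen: Dict[Hashable, float] = {}

    def predict(self, x: Hashable) -> Optional[float]:
        # None means "I don't know"; the learner never guesses.
        return self._seen.get(x)

    def observe(self, x: Hashable, y: float) -> None:
        # Store the true label revealed after an "I don't know".
        self._seen[x] = y


def teacher_should_demonstrate(student_expected_utility: float,
                               teacher_utility: float,
                               epsilon: float = 0.1) -> bool:
    """Hypothetical rule (an assumption, not taken from the paper):
    provide a trace only when the student's reported expected utility
    falls short of the teacher's own by more than epsilon."""
    return teacher_utility - student_expected_utility > epsilon


learner = MemorizingKWIKLearner()
first_answer = learner.predict("s0")      # unseen input, so "I don't know"
learner.observe("s0", 1.0)
second_answer = learner.predict("s0")     # now known

needs_help = teacher_should_demonstrate(0.2, 1.0)
no_help = teacher_should_demonstrate(0.95, 1.0)
```

In this toy setup the student keeps learning on its own wherever it can resolve "I don't know" responses by observation, and the teacher intervenes only when the reported utility gap is large, which mirrors the abstract's stated goal of using teacher traces sparingly.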

Author Information

Thomas Walsh (Sony AI)
Daniel K Hewlett (Google)
Clayton T Morrison (University of Arizona)

More from the Same Authors