BASALT: A MineRL Competition on Solving Human-Judged Tasks + Q&A
Rohin Shah · Cody Wild · Steven Wang · Neel Alex · Brandon Houghton · William Guss · Sharada Mohanty · Stephanie Milani · Nicholay Topin · Pieter Abbeel · Stuart Russell · Anca Dragan

Thu Dec 09 11:25 AM -- 11:45 AM (PST)
Event URL: https://minerl.io/basalt/

The Benchmark for Agents that Solve Almost-Lifelike Tasks (BASALT) competition aims to promote research on learning from human feedback, so that agents can pursue tasks that lack crisp, easily defined reward functions. Each task consists of a simple English-language description and a Gym environment, with no associated reward function but with expert demonstrations. Participants will train agents for these tasks using their preferred methods; we expect typical solutions to use imitation learning or learning from comparisons. Submitted agents will be evaluated on how well they complete the tasks, as judged by humans given the same task descriptions.
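As a rough illustration of what "learning from comparisons" can look like when no reward function is provided, the sketch below fits a linear reward model to pairwise preferences using the Bradley-Terry model. It is not the competition's baseline or evaluation procedure. The feature vectors, the function names (`reward`, `fit_reward_from_comparisons`), and the hyperparameters are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def reward(w, features):
    """Linear reward: dot product of weights and trajectory features."""
    return sum(wi * fi for wi, fi in zip(w, features))


def fit_reward_from_comparisons(comparisons, dim, lr=0.1, epochs=200):
    """Fit reward weights from pairwise human preferences (hypothetical helper).

    comparisons: list of (preferred_features, rejected_features) pairs,
    where each element is a list of `dim` floats summarizing a trajectory.
    Assumes the Bradley-Terry model:
        P(a preferred over b) = sigmoid(reward(a) - reward(b))
    and maximizes the log-likelihood with stochastic gradient ascent.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            diff = [p - r for p, r in zip(preferred, rejected)]
            p_correct = sigmoid(reward(w, diff))
            # Gradient of log sigmoid(w . diff) with respect to w.
            w = [wi + lr * (1.0 - p_correct) * di for wi, di in zip(w, diff)]
    return w


# Toy usage: a judge consistently prefers trajectories with a larger
# first feature and a smaller second feature (made-up data).
toy_comparisons = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([2.0, 1.0], [1.0, 2.0]),
]
toy_weights = fit_reward_from_comparisons(toy_comparisons, dim=2)
```

The learned weights could then serve as a stand-in reward for a standard reinforcement learning algorithm. In practice the trajectory features would come from the environment's observations rather than hand-written lists.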

Author Information

Rohin Shah (DeepMind)

Rohin is a Research Scientist on the technical AGI safety team at DeepMind. He completed his PhD at the Center for Human-Compatible AI at UC Berkeley, where he worked on building AI systems that can learn to assist a human user, even if they don't initially know what the user wants. He is particularly interested in big picture questions about artificial intelligence. What techniques will we use to build human-level AI systems? How will their deployment affect the world? What can we do to make this deployment go better? He writes up summaries and thoughts about recent work tackling these questions in the Alignment Newsletter.

Cody Wild (Google Research)
Steven Wang (UC Berkeley)
Neel Alex (University of Cambridge)
Brandon Houghton (OpenAI)
William Guss (Carnegie Mellon University)
Sharada Mohanty (AIcrowd SA)
Stephanie Milani (Carnegie Mellon University)
Nicholay Topin (Carnegie Mellon University)
Pieter Abbeel (UC Berkeley & Covariant)

Pieter Abbeel is Professor and Director of the Robot Learning Lab at UC Berkeley [2008- ], Co-Director of the Berkeley AI Research (BAIR) Lab, Co-Founder of covariant.ai [2017- ], Co-Founder of Gradescope [2014- ], Advisor to OpenAI, Founding Faculty Partner of the AI@TheHouse venture fund, and Advisor to many AI/Robotics start-ups. He works in machine learning and robotics. In particular, his research focuses on how to make robots learn from people (apprenticeship learning), how to make robots learn through their own trial and error (reinforcement learning), and how to speed up skill acquisition through learning-to-learn (meta-learning). His robots have learned advanced helicopter aerobatics, knot-tying, basic assembly, organizing laundry, locomotion, and vision-based robotic manipulation. He has won numerous awards, including best paper awards at ICML, NIPS, and ICRA; early career awards from NSF, DARPA, ONR, AFOSR, Sloan, TR35, and IEEE; and the Presidential Early Career Award for Scientists and Engineers (PECASE). Pieter's work is frequently featured in the popular press, including the New York Times, BBC, Bloomberg, Wall Street Journal, Wired, Forbes, Tech Review, and NPR.

Stuart Russell (UC Berkeley)
Anca Dragan (UC Berkeley)

More from the Same Authors