Learning Machines can Curl - Adaptive Deep Reinforcement Learning enables the robot Curly to win against human players in an icy world
Dong-Ok Won · Sang-Hoon Lee · Klaus-Robert Müller · Seong-Whan Lee

Tue Dec 10 05:30 PM -- 07:30 PM (PST) @ East Exhibition Hall B + C

Most artificial intelligence (AI) based learning systems act in virtual or laboratory environments. Recently, deep reinforcement learning (DRL) has enabled real-world applications in robotics, e.g., walking and arm control. Here we teach a robot to succeed in curling, an Olympic discipline and a highly complex real-world application in which a robot must carefully learn to play the game on a slippery ice sheet in order to compete well against human opponents. This scenario encompasses fundamental challenges: uncertainty, nonstationarity, infinite state spaces and, most importantly, scarce data. To succeed, we adapted standard deep reinforcement learning to cope with these challenges. Specifically, we use a physics simulation for pretraining, then adapt and correct with scarce real-world data. Notably, a policy originally learned from simulation data typically causes erroneous actions in the real world; in particular, the uncertainty experienced in real-world applications is likely to disturb a DRL system and lead it to perform erroneous actions. One fundamental objective of this study is thus to better understand and model the transfer from simulation to real-world scenarios with uncertainty. Note that the curling ice sheet is an environment with highly varying uncertainty that has a profound effect on throw performance; humans require years of practice to master the game, which has complex strategic elements in addition to the throw itself. The nonstationarity and infinite state spaces require novel technical contributions. We demonstrate our proposed framework and show videos, experiments and statistics of Curly, our AI curling robot, being tested on a real curling ice sheet. Curly performed well both in classical game situations and when interacting with human opponents, e.g., the top-ranked Korean amateur high school curling team and top-ranked human opponents (i.e., a top-ranked women's curling team and the Korea national wheelchair curling team).

Author Information

Dong-Ok Won (Korea University)
Sang-Hoon Lee (Korea University)
Klaus-Robert Müller (TU Berlin)
Seong-Whan Lee (Korea University)
