
Towards Generalization and Simplicity in Continuous Control
Aravind Rajeswaran · Kendall Lowrey · Emanuel Todorov · Sham Kakade

Mon Dec 04 06:30 PM -- 10:30 PM (PST) @ Pacific Ballroom #202

The remarkable successes of deep learning in speech recognition and computer vision have motivated efforts to adapt similar techniques to other problem domains, including reinforcement learning (RL). Consequently, RL methods have produced rich motor behaviors on simulated robot tasks, with their success largely attributed to the use of multi-layer neural networks. This work is among the first to carefully study what might be responsible for these recent advancements. Our main result calls this emerging narrative into question by showing that much simpler architectures -- based on linear and radial basis function (RBF) parameterizations -- achieve performance comparable to state-of-the-art results. We study different policy representations not only with regard to the performance measures at hand, but also with regard to robustness to external perturbations. We again find that the learned neural network policies -- under the standard training scenarios -- are no more robust than linear (or RBF) policies; in fact, all three are remarkably brittle. Finally, we directly modify the training scenarios in order to favor more robust policies, and we again do not find a compelling case to favor multi-layer architectures. Overall, this study suggests that multi-layer architectures should not be the default choice, unless a side-by-side comparison to simpler architectures shows otherwise. More generally, we hope that these results lead to more interest in carefully studying the architectural choices, and associated trade-offs, for training generalizable and robust policies.
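
To make the two simpler parameterizations concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the RBF features are built, so this sketch assumes random Fourier-style features (a fixed random projection followed by a sine), and the class names, `num_features`, `bandwidth`, and `log_std` are all hypothetical. Only the weights `W` and bias `b` (and possibly `log_std`) would be trained by whatever policy-gradient method is used; training is not shown.

```python
import numpy as np


class LinearPolicy:
    """Gaussian policy whose mean is an affine function of the features.

    Hypothetical sketch: names and defaults are illustrative assumptions.
    """

    def __init__(self, feat_dim, act_dim, log_std=-0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = np.zeros((act_dim, feat_dim))    # trainable weights
        self.b = np.zeros(act_dim)                # trainable bias
        self.log_std = np.full(act_dim, log_std)  # per-action log std dev

    def features(self, obs):
        # Linear policy: the features are the raw observation.
        return np.asarray(obs, dtype=float)

    def mean(self, obs):
        return self.W @ self.features(obs) + self.b

    def act(self, obs):
        # Sample an action from N(mean, diag(exp(log_std))^2).
        mu = self.mean(obs)
        return mu + np.exp(self.log_std) * self.rng.standard_normal(mu.shape)


class RBFPolicy(LinearPolicy):
    """Linear policy on top of fixed random Fourier-style features.

    Assumption: features are sin(P @ obs / bandwidth + phase) with P and
    phase drawn once at construction and never trained.
    """

    def __init__(self, obs_dim, act_dim, num_features=64, bandwidth=1.0,
                 log_std=-0.5, seed=0):
        super().__init__(num_features, act_dim, log_std, seed)
        self.P = self.rng.standard_normal((num_features, obs_dim))
        self.phase = self.rng.uniform(-np.pi, np.pi, num_features)
        self.bandwidth = bandwidth

    def features(self, obs):
        obs = np.asarray(obs, dtype=float)
        return np.sin(self.P @ obs / self.bandwidth + self.phase)


if __name__ == "__main__":
    obs = np.ones(3)
    print(LinearPolicy(3, 2).act(obs).shape)  # action vector of length 2
    print(RBFPolicy(3, 2).act(obs).shape)     # same interface, richer features
```

Both classes expose the same `mean` and `act` interface, so the same training loop could be run on either one, which is the kind of side-by-side comparison the abstract argues for.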

Author Information

Aravind Rajeswaran (University of Washington)
Kendall Lowrey (University of Washington)
Emanuel Todorov (University of Washington)
Sham Kakade (University of Washington)
