
Oral in Workshop: Deep Reinforcement Learning

Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning

Byungchan Ko · Jungseul Ok


Abstract:

We consider data augmentation techniques to improve data efficiency and generalization performance in reinforcement learning (RL). Our empirical study on OpenAI Procgen shows that the timing of applying augmentation is critical: to maximize test performance, an augmentation needs to be applied either throughout the entire RL training or only after RL training ends. More specifically, if the regularization imposed by an augmentation is helpful only at test time, it is better to postpone the augmentation until after training rather than use it during training, both in terms of sample and computation complexity, since such an augmentation often disturbs the training process. Conversely, an augmentation providing regularization useful during training needs to be applied over the whole training period to fully exploit its benefit in terms of not only generalization but also data efficiency. Based on these findings, we propose a mechanism to fully exploit a set of augmentations: it identifies an augmentation (including no augmentation) that maximizes RL training performance, and then utilizes all the augmentations via network distillation to maximize test performance. Our experiments empirically validate the proposed method against other automatic augmentation mechanisms.
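For intuition, the two-phase schedule described in the abstract might look roughly like the following Python sketch. This is a minimal illustration, not the authors' implementation: the function names (`train_rl`, `evaluate`, `distill`) are hypothetical placeholders, and the exhaustive search over candidate augmentations is just one naive way to instantiate the selection step.

```python
def adaptive_augmentation_schedule(augmentations, train_rl, evaluate, distill):
    """Hypothetical sketch of the proposed two-phase mechanism.

    Phase 1: pick the augmentation (possibly none) that maximizes RL
    training performance. Phase 2: distill the trained policy into a
    student using all augmentations to maximize test-time generalization.
    """
    # "None" stands for training without any augmentation.
    candidates = [None] + list(augmentations)

    # Phase 1: train with each candidate and keep the best-performing policy.
    best_policy, best_score = None, float("-inf")
    for aug in candidates:
        policy = train_rl(augmentation=aug)
        score = evaluate(policy)  # return on the training environments
        if score > best_score:
            best_policy, best_score = policy, score

    # Phase 2: post-training distillation with all augmentations applied,
    # transferring the teacher's behavior to an augmentation-regularized student.
    student = distill(teacher=best_policy, augmentations=augmentations)
    return student
```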
