
Poster

Mitigating Covariate Shift in Behavioral Cloning via Robust Distribution Correction Estimation

Seokin Seo · Byung-Jun Lee · Jongmin Lee · HyeongJoo Hwang · Hongseok Yang · Kee-Eung Kim

West Ballroom A-D #6401
Thu 12 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

We consider offline imitation learning (IL), which aims to train an agent to imitate an expert from a dataset of demonstrations, without online interaction with the environment. Behavioral Cloning (BC) is a simple yet effective approach to offline IL, but it is well known to be vulnerable to covariate shift, which arises from the mismatch between the state distribution induced by the learned policy and that of the data collection policy. In this paper, to mitigate the effect of covariate shift in BC, we formulate a robust BC training objective and employ stationary distribution correction ratio estimation (DICE) to derive a feasible solution. We evaluate the effectiveness of our method through an extensive set of experiments covering diverse covariate shift scenarios. The results demonstrate that the proposed approach improves robustness to these shifts, outperforming existing offline IL methods in such scenarios.
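
To make the idea of reweighting the BC objective concrete, the following is a minimal sketch of a per-sample weighted behavioral cloning loss for a discrete-action policy. It is not the paper's objective or its DICE estimator: the function names `log_softmax` and `weighted_bc_loss` and the `weights` argument are hypothetical, and the weights are simply supplied by the caller as stand-ins for whatever correction ratios an estimator would produce.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of action logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def weighted_bc_loss(batch_logits, actions, weights):
    """Weighted negative log-likelihood of demonstrated actions.

    batch_logits: list of per-state action logits from the policy
    actions:      list of demonstrated action indices
    weights:      hypothetical non-negative per-sample weights
                  (assumption: stand-ins for distribution correction ratios)
    """
    total_w = sum(weights)
    loss = 0.0
    for logits, a, w in zip(batch_logits, actions, weights):
        # Normalized weight times the negative log-probability of the demo action
        loss += (w / total_w) * -log_softmax(logits)[a]
    return loss
```

With uniform weights this reduces to the ordinary BC negative log-likelihood; raising the weight of a sample makes the policy's error on that state count more, which is the general mechanism a correction-ratio approach relies on.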