Counterfactual Vision-and-Language Navigation: Unravelling the Unseen
Amin Parvaneh · Ehsan Abbasnejad · Damien Teney · Javen Qinfeng Shi · Anton van den Hengel

Wed Dec 09 09:00 PM -- 11:00 PM (PST) @ Poster Session 4 #1232

The task of vision-and-language navigation (VLN) requires an agent to follow text instructions to find its way through simulated household environments. A prominent challenge is to train an agent capable of generalising to new environments at test time, rather than one that simply memorises trajectories and visual details observed during training. We propose a new strategy that learns both from observations and from generated counterfactual environments. We describe an effective algorithm to generate counterfactual observations on the fly for VLN, as linear combinations of existing environments. Simultaneously, we encourage the agent's actions to remain stable between original and counterfactual environments through a novel training objective, effectively removing the spurious features that otherwise bias the agent. Our experiments show that this technique provides significant improvements in generalisation on benchmarks for Room-to-Room navigation and Embodied Question Answering.
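The two ingredients in the abstract can be sketched in a few lines. The following is a hypothetical illustration, not the authors' implementation: `counterfactual_features` forms a convex (linear) combination of two environments' visual features, with the mixing coefficient drawn from a Beta distribution (an assumption borrowed from mixup-style methods), and `consistency_loss` is one plausible choice of penalty, a KL divergence between the agent's action distributions in the original and counterfactual environments.

```python
import numpy as np

def counterfactual_features(feat_a, feat_b, alpha=0.4, rng=None):
    """Generate a counterfactual observation as a linear combination of
    two environments' feature vectors. The Beta(alpha, alpha) sampling of
    the coefficient is an assumption, not taken from the paper."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)  # mixing coefficient in [0, 1]
    return lam * feat_a + (1.0 - lam) * feat_b, lam

def consistency_loss(p_orig, p_cf, eps=1e-12):
    """Penalise unstable actions: KL divergence between the action
    distribution in the original environment (p_orig) and in the
    counterfactual one (p_cf). Zero when the two distributions match."""
    p_orig = np.asarray(p_orig, dtype=float)
    p_cf = np.asarray(p_cf, dtype=float)
    return float(np.sum(p_orig * (np.log(p_orig + eps) - np.log(p_cf + eps))))
```

In training, this loss would be added to the usual navigation objective so that minimising it discourages the agent from relying on features that differ between an environment and its counterfactual.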

Author Information

Amin Parvaneh (University of Adelaide)
Ehsan Abbasnejad (University of Adelaide)
Damien Teney (University of Adelaide)
Javen Qinfeng Shi (University of Adelaide)
Anton van den Hengel (University of Adelaide)
