

Poster

Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor

Keji He · Kehan Chen · Jiawang Bai · Yan Huang · Qi Wu · Shu-Tao Xia · Liang Wang

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Vision-and-Language Navigation (VLN) requires an agent to dynamically explore environments by following natural language instructions. VLN agents, closely integrated into daily life, pose a substantial threat to the security of privacy and property if they behave maliciously. However, this serious issue has long been overlooked. In this paper, we pioneer the exploration of object-aware backdoored VLN, achieved by implanting object-aware backdoors during the training phase. Tailored to the unique VLN characteristics of cross-modality and continuous decision-making, we propose a novel backdoored VLN paradigm: the IPR Backdoor. This enables the agent to behave abnormally once it encounters object triggers during language-guided navigation in unseen environments, thereby executing an attack on the target scene. Experiments demonstrate the effectiveness of our method in both physical and digital spaces across different VLN agents, as well as its robustness to various visual and textual variations. In addition, our method preserves navigation performance in normal scenarios with remarkable stealthiness.
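The training-phase implantation described above follows the general pattern of backdoor data poisoning. The sketch below is a generic, minimal illustration of that pattern, not the paper's IPR Backdoor method: all names (`poison_dataset`, `TRIGGER_OBJECT`, `TARGET_ACTION`, the episode schema) are hypothetical, assuming episodes are labeled with a single next action.

```python
import random

# Hypothetical illustration of training-time backdoor poisoning:
# a small fraction of training episodes whose observations contain the
# trigger object are relabeled with an attacker-chosen target action,
# so the trained agent associates the object with abnormal behavior.

TRIGGER_OBJECT = "red_backpack"   # everyday object serving as the trigger
TARGET_ACTION = "stop"            # attacker-chosen abnormal behavior
POISON_RATE = 0.05                # fraction of trigger episodes to poison

def poison_dataset(episodes, seed=0):
    """Return a copy of `episodes` with a subset of trigger episodes relabeled.

    Each episode is a dict: {"objects": [str, ...], "action": str}.
    Episodes without the trigger object are left unchanged, which is what
    preserves clean-scenario navigation performance (stealthiness).
    """
    rng = random.Random(seed)
    poisoned = []
    for ep in episodes:
        ep = dict(ep)  # shallow copy so the original dataset is untouched
        if TRIGGER_OBJECT in ep["objects"] and rng.random() < POISON_RATE:
            ep["action"] = TARGET_ACTION  # implant the backdoor label
        poisoned.append(ep)
    return poisoned
```

Keeping the poison rate low and leaving trigger-free episodes untouched is what makes such attacks hard to detect from clean-data behavior alone.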
