Workshop
|
|
ANav: Action-Aware Zero-Shot Robot Navigation Using Vision-Language Ability of Foundation Models
Peihao Chen · Xinyu Sun · Hongyan Zhi · Runhao Zeng · Thomas Li · Mingkui Tan · Chuang Gan
|
|
Workshop
|
|
Vision-and-Language Navigation in Real World using Foundation Models
Chengguang Xu · Hieu T. Nguyen · Christopher Amato · Lawson Wong
|
|
Poster
|
Wed 15:00
|
PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
Jialu Li · Mohit Bansal
|
|
Poster
|
Thu 8:45
|
Frequency-Enhanced Data Augmentation for Vision-and-Language Navigation
Keji He · Chenyang Si · Zhihe Lu · Yan Huang · Liang Wang · Xinchao Wang
|
|
Poster
|
Thu 8:45
|
Are Diffusion Models Vision-And-Language Reasoners?
Benno Krojer · Elinor Poole-Dayan · Vikram Voleti · Chris Pal · Siva Reddy
|
|
Poster
|
Tue 8:45
|
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Morris Alper · Hadar Averbuch-Elor
|
|
Poster
|
Tue 8:45
|
VisIT-Bench: A Dynamic Benchmark for Evaluating Instruction-Following Vision-and-Language Models
Yonatan Bitton · Hritik Bansal · Jack Hessel · Rulin Shao · Wanrong Zhu · Anas Awadalla · Josh Gardner · Rohan Taori · Ludwig Schmidt
|
|
Workshop
|
Sat 8:35
|
Compositional Generalization in Vision-Language Models uses the Language Modality only
|
|
Workshop
|
|
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks
Avinash Madasu · Anahita Bhiwandiwalla · Vasudev Lal
|
|
Workshop
|
|
Learning Inner Monologue and Its Utilization in Vision-Language Challenges
Diji Yang · Kezhen Chen · Jinmeng Rao · Xiaoyuan Guo · Yawen Zhang · Jie Yang · Yi Zhang
|
|
Workshop
|
|
Selective Prediction For Open-Ended Question Answering in Black-Box Vision-Language Models
Zaid Khan · Yun Fu
|
|