125 Results
Workshop
|
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning Somnath Sendhil Kumar · Yash Gadhia · Tanuja Ganu · Akshay Nambi |
||
Poster
|
Wed 11:00 |
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning Hao Shao · Shengju Qian · Han Xiao · Guanglu Song · ZHUOFAN ZONG · Letian Wang · Yu Liu · Hongsheng Li |
|
Poster
|
Fri 11:00 |
Multi-modal Situated Reasoning in 3D Scenes Xiongkun Linghu · Jiangyong Huang · Xuesong Niu · Xiaojian (Shawn) Ma · Baoxiong Jia · Siyuan Huang |
|
Poster
|
Thu 11:00 |
MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset Xin Shen · Heming Du · Hongwei Sheng · Shuyun Wang · Hui Chen · Huiqiang Chen · Zhuojie Wu · Xiaobiao Du · Jiaying Ying · Ruihan Lu · Qingzheng Xu · Xin Yu |
|
Poster
|
MMM-RS: A Multi-modal, Multi-GSD, Multi-scene Remote Sensing Dataset and Benchmark for Text-to-Image Generation jialin luo · Yuanzhi Wang · Ziqi Gu · Yide Qiu · Shuaizhen Yao · Fuyun Wang · Chunyan Xu · Wenhua Zhang · Dan Wang · Zhen Cui |
||
Competition
|
Sat 14:30 |
Invited talk: The Journey Towards Universal Perception: Experiments in Unsupervised, Multi-Task, Multi-Domain, Multi-Modal, and Multi-Channel Learning John R. Hershey |
|
Poster
|
Wed 11:00 |
M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and Multispectral Data Matthew Allen · Francisco Dorr · Joseph Alejandro Gallego Mejia · Laura Martínez-Ferrer · Anna Jungbluth · Freddie Kalaitzis · Raul Ramos-Pollán |
|
Workshop
|
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind Haojun Shi · Suyu Ye · Xinyu Fang · Chuanyang Jin · Leyla Isik · Yen-Ling Kuo · Tianmin Shu |
||
Poster
|
Fri 11:00 |
Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding Chuyang Zhao · YuXin Song · Junru Chen · KANG RONG · Haocheng Feng · Gang Zhang · Shufan Ji · Jingdong Wang · Errui Ding · Yifan Sun |
|
Affinity Event
|
Multi-Modal Pipeline Defect Localization Mariam Manzoor · Zahra Arabi Narei · Henry Leung · Scott Miller |
||
Affinity Event
|
Rethinking Multi-Modal Tokenization for Reinforcement Learning with Transformers in Mobility-on-Demand Tasks Gabriel Schwartz · Raphael Camargo |