359 Results
Type | Time | Title | Authors
Poster | Wed 16:30 | Multimodal Large Language Models Make Text-to-Image Generative Models Align Better | Xun Wu · Shaohan Huang · Guolong Wang · Jing Xiong · Furu Wei
Workshop | | SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems | Patrick Emami · Zhaonan Li · Saumya Sinha · Truc Nguyen
Workshop | Sat 10:30 | Efficient Generative Multimodal Integration (EGMI): Enabling Audio Generation from Text-Image Pairs through Alignment with Large Language Models | Taemin Kim · Wooyeol Baek · Heeseok Oh
Poster | Wed 16:30 | Contrasting with Symile: Simple Model-Agnostic Representation Learning for Unlimited Modalities | Adriel Saporta · Aahlad Manas Puli · Mark Goldstein · Rajesh Ranganath
Poster | Fri 16:30 | GenRL: Multimodal-foundation world models for generalization in embodied agents | Pietro Mazzaglia · Tim Verbelen · Bart Dhoedt · Aaron Courville · Sai Rajeswar Mudumba
Workshop | | Language Models for Text-guided Protein Evolution | Zhanghan Ni · Shengchao Liu · Animashree Anandkumar
Workshop | | RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction | Yuwei Zhang · Tong Xia · Aaqib Saeed · Cecilia Mascolo
Workshop | | Large Language Models Still Exhibit Bias in Long Text | Wonje Jeung · Dongjae Jeon · Ashkan Yousefpour · Jonghyun Choi
Poster | Fri 11:00 | TOPA: Extending Large Language Models for Video Understanding via Text-Only Pre-Alignment | Wei Li · Hehe Fan · Yongkang Wong · Mohan Kankanhalli · Yi Yang
Poster | Wed 11:00 | HAWK: Learning to Understand Open-World Video Anomalies | Jiaqi Tang · Hao LU · RUIZHENG WU · Xiaogang Xu · Ke Ma · Cheng Fang · Bin Guo · Jiangbo Lu · Qifeng Chen · Yingcong Chen
Poster | Wed 16:30 | A Practitioner's Guide to Real-World Continual Multimodal Pretraining | Vishaal Udandarao · Karsten Roth · Sebastian Dziadzio · Ameya Prabhu · Mehdi Cherti · Oriol Vinyals · Olivier Henaff · Samuel Albanie · Zeynep Akata · Matthias Bethge
Poster | Thu 11:00 | Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning | Brandon Huang · Chancharik Mitra · Leonid Karlinsky · Assaf Arbelle · Trevor Darrell · Roei Herzig