

Visual Cropping Improves Zero-Shot Question Answering of Multimodal Large Language Models

Jiarui Zhang ⋅ Mahyar Khayatkhoei ⋅ Prateek Chhikara ⋅ Filip Ilievski

