530 Results
Workshop
|
CanadaFire2023: Burned Area Mapping Datasets and Benchmarks for Canadian Wildfires in 2023 Zilong Zhong · Alemu Gonsamo |
||
Workshop
|
Sun 14:15 |
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions Mohammadmostafa Rostamkhani · Baktash Ansariogholbake · Hoorieh Sabzevari · Farzan Rahmani · Sauleh Eetemadi |
|
Workshop
|
Sun 11:50 |
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions Mohammadmostafa Rostamkhani · Baktash Ansariogholbake · Hoorieh Sabzevari · Farzan Rahmani · Sauleh Eetemadi |
|
Workshop
|
ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain Ali Shiraee Kasmaee · Mohammad Khodadad · Mohammad Arshi Saloot · Nick Sherck · Stephen Dokas · Hamidreza Mahyar · Soheila Samiee |
||
Workshop
|
AtmosArena: Benchmarking Foundation Models for Atmospheric Sciences Tung Nguyen · Prateik Sinha · Advit Deepak · Karen A McKinnon · Aditya Grover |
||
Workshop
|
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench Yuan Li · Yue Huang · Yuli Lin · Siyuan Wu · Yao Wan · Lichao Sun |
||
Workshop
|
miniCodeProps: a Minimal Benchmark for Proving Code Properties Evan Lohn · Sean Welleck |
||
Workshop
|
CantorNet: A Sandbox for Testing Topological and Geometrical Measures Michal Lewandowski · Hamid Eghbalzadeh · Bernhard A. Moser |
||
Workshop
|
SharedContextBench: How Lossy are Long-context Methods in KV Cache Reuse Yucheng LI · Huiqiang Jiang · Qianhui Wu · Xufang Luo · Surin Ahn · Chengruidong Zhang · Amir Abdi · Dongsheng Li · Jianfeng Gao · Yuqing Yang · Lili Qiu |
||
Poster
|
Wed 11:00 |
HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Rhea Sukthanker · Arber Zela · Benedikt Staffler · Aaron Klein · Lennart Purucker · Jörg Franke · Frank Hutter |
|
Workshop
|
Benchmark to Audit LLM Generated Clinical Notes for Disparities Arising from Biases and Stereotypes Hongyu Cai · Swetasudha Panda · Naveen Jafer Nizar · Qinlan Shen · Daeja Oxendine · Sumana Srivatsa · Krishnaram Kenthapadi |