Skip to yearly menu bar Skip to main content


Search All 2024 Events
 

21 Results

<<   <   Page 1 of 2   >   >>
Workshop
Sat 12:00 H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models
Nhi Pham · Michael Schott
Poster
Wed 16:30 ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models
Shuo Liu · Kaining Ying · Hao Zhang · yue yang · Yuqi Lin · Tianle Zhang · Chuanhao Li · Yu Qiao · Ping Luo · Wenqi Shao · Kaipeng Zhang
Poster
Fri 11:00 ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
jingnan zheng · Han Wang · An Zhang · Nguyen Duy Tai · Jun Sun · Tat-Seng Chua
Workshop
ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
Mohita Chowdhury · Yajie He · Ernest Lim · Aisling Higham
Workshop
AIR-Bench 2024: Safety Evaluation Based on Risk Categories from Regulations and Policies
Kevin Klyman
Workshop
ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents
Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal
Workshop
Sat 15:45 ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents
Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal
Workshop
Examining Distribution-based Amortized Fair Ranking
Aparna Balagopalan · Kai Wang · Asia Biega · Marzyeh Ghassemi
Workshop
What's in a Query: Examining Distribution-based Amortized Fair Ranking
Aparna Balagopalan · Kai Wang · Asia Biega · Marzyeh Ghassemi
Workshop
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students’ Hand-Drawn Math Images
Sami Baral · Li Lucy · Ryan Knight · Alice Ng · Luca Soldaini · Neil Heffernan · Kyle Lo
Workshop
RelWire: Metric Based Rewiring
Rishi Sonthalia · Anna Gilbert · Matthew Durham
Poster
Wed 11:00 Evaluating the design space of diffusion-based generative models
Yuqing Wang · Ye He · Molei Tao