Search All 2024 Events
 

18 Results

Poster
Fri 16:30 Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Sukmin Yun · Haokun Lin · Rusiru Thushara · Mohammad Bhat · Yongxin Wang · Zutao Jiang · Mingkai Deng · Jinhong Wang · Tianhua Tao · Junbo Li · Haonan Li · Preslav Nakov · Timothy Baldwin · Zhengzhong Liu · Eric Xing · Xiaodan Liang · Zhiqiang Shen
Poster
CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses
Jing Yao · Xiaoyuan Yi · Xing Xie
Workshop
Sat 15:45 A Framework for Evaluating LLMs Under Task Indeterminacy
Luke Guerdan · Hanna Wallach · Solon Barocas · Alexandra Chouldechova
Workshop
Sat 15:45 ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents
Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal
Workshop
Evaluating Refusal
Shira Abramovich · Anna J. Ma
Poster
Wed 11:00 Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context
Jingru (Jessica) Jia · Zehua Yuan · Junhao Pan · Paul McNamara · Deming Chen
Workshop
GenAI Evaluation Maturity Framework (GEMF) to assess and improve GenAI Evaluations
Yilin Zhang · Frank J. Kanayet
Poster
Thu 11:00 Bias and Volatility: A Statistical Framework for Evaluating Large Language Model's Stereotypes and the Associated Generation Inconsistency
Yiran Liu · Ke Yang · Zehan Qi · Xiao Liu · Yang Yu · Cheng Xiang Zhai
Workshop
Sat 15:45 A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts
Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach
Workshop
Coordinated Robustness Evaluation Framework for Vision Language Models
Ashwin Ramesh Babu · Sajad Mousavi · Desik Rengarajan · Vineet Gundecha · Sahand Ghorbanpour · Avisek Naug · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Soumyendu Sarkar