18 Results
Poster
|
Fri 16:30 |
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs Sukmin Yun · Haokun Lin · Rusiru Thushara · Mohammad Bhat · Yongxin Wang · Zutao Jiang · Mingkai Deng · Jinhong Wang · Tianhua Tao · Junbo Li · Haonan Li · Preslav Nakov · Timothy Baldwin · Zhengzhong Liu · Eric Xing · Xiaodan Liang · Zhiqiang Shen |
|
Poster
|
CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses Jing Yao · Xiaoyuan Yi · Xing Xie |
||
Workshop
|
ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal |
||
Workshop
|
Sat 15:45 |
A Framework for Evaluating LLMs Under Task Indeterminacy Luke Guerdan · Hanna Wallach · Solon Barocas · Alexandra Chouldechova |
|
Workshop
|
A Framework for Evaluating LLMs Under Task Indeterminacy Luke Guerdan · Hanna Wallach · Solon Barocas · Alexandra Chouldechova |
||
Workshop
|
Sat 15:45 |
ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal |
|
Workshop
|
Evaluating Refusal Shira Abramovich · Anna J. Ma |
||
Poster
|
Wed 11:00 |
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context Jingru (Jessica) Jia · Zehua Yuan · Junhao Pan · Paul McNamara · Deming Chen |
|
Workshop
|
GenAI Evaluation Maturity Framework (GEMF) to assess and improve GenAI Evaluations Yilin Zhang · Frank J. Kanayet |
||
Poster
|
Thu 11:00 |
Bias and Volatility: A Statistical Framework for Evaluating Large Language Model's Stereotypes and the Associated Generation Inconsistency Yiran Liu · Ke Yang · Zehan Qi · Xiao Liu · Yang Yu · Cheng Xiang Zhai |
|
Workshop
|
Sat 15:45 |
A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach |
|
Workshop
|
Coordinated Robustness Evaluation Framework for Vision Language Models Ashwin Ramesh Babu · Sajad Mousavi · Desik Rengarajan · Vineet Gundecha · Sahand Ghorbanpour · Avisek Naug · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Soumyendu Sarkar |