Workshop | Evaluating Superhuman Models with Consistency Checks
Lukas Fluri · Daniel Paleka · Florian Tramer

Workshop | Learning Models and Evaluating Policies with Offline Off-Policy Data under Partial Observability
Shreyas Chaudhari · Philip Thomas · Bruno C. da Silva

Workshop | Evaluating task specific finetuning for protein language models
Robert Schmirler

Poster | Tue 15:15 | High Precision Causal Model Evaluation with Conditional Randomization
Chao Ma · Cheng Zhang

Workshop | Sat 8:50 | Self-Evaluation Improves Selective Generation in Large Language Models
Jie Ren · Yao Zhao · Tu Vu · Peter Liu · Balaji Lakshminarayanan

Workshop | ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
Maitreya Patel · Tejas Gokhale · Chitta Baral · 'YZ' Yezhou Yang

Workshop | Preparation Of Labeled Cryo-ET Datasets For Training And Evaluation Of Machine Learning Models
Aygul Ishemgulova · Alex J. Noble · Tristan Bepler · Alex De Marco

Affinity Workshop | Evaluating zero-shot image classification based on visual language model with relation to background shift
Flávio Santos · Maynara Souza · Cleber Zanchettin

Workshop | On Incorporating new Variables during Evaluation
Harsimran Bhasin · Soumyadeep Ghosh

Workshop | Zero-shot Conversational Summarization Evaluations with small Large Language Models
Ramesh Manuvinakurike · Saurav Sahay · Sangeeta Manepalli · Lama Nachman

Workshop | Evaluating the Utility of Model Explanations for Model Development
Shawn Im · Jacob Andreas · Yilun Zhou

Poster | Wed 8:45 | LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Yujie Lu · Xianjun Yang · Xiujun Li · Xin Eric Wang · William Yang Wang