47 Results
Workshop | Sat 12:00 | ICScore: Metrics for Evaluating Interestingness and Creativity of Stories | Junha Lee · Jaeshin Cho · Youngjin Cho · Hyewon Jin · Hyemin Lee · Min Song
Workshop | | Multilingual Hallucination Gaps in Large Language Models | Cléa Chataigner · Afaf Taik · Golnoosh Farnadi
Workshop | Sat 15:45 | Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks | Rachel Longjohn · Giri Gopalan · Emily Casleton
Poster | Fri 16:30 | Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) | Michael Saxon · Fatima Jahara · Mahsa Khoshnoodi · Yujie Lu · Aditya Sharma · William Yang Wang
Workshop | Sat 9:00 | Algorithmic Fairness through the lens of Metrics and Evaluation | Awa Dieng · Miriam Rateike · Jamelle Watson-Daniels · Golnoosh Farnadi · Nando Fioretto
Workshop | | Benchmark to Audit LLM Generated Clinical Notes for Disparities Arising from Biases and Stereotypes | Hongyu Cai · Swetasudha Panda · Naveen Jafer Nizar · Qinlan Shen · Daeja Oxendine · Sumana Srivatsa · Krishnaram Kenthapadi
Workshop | Sat 17:27 | Benchmark to Audit LLM Generated Clinical Notes for Disparities Arising from Biases and Stereotypes | Hongyu Cai · Swetasudha Panda · Naveen Jafer Nizar · Qinlan Shen · Daeja Oxendine · Sumana Srivatsa · Krishnaram Kenthapadi
Workshop | | Evaluating Gender Bias Transfer between Pre-trained and Prompt Adapted Language Models | Nivedha Sivakumar · Natalie Mackraz · Samira Khorshidi · Krishna Patel · Barry-John Theobald · Luca Zappella · Nicholas Apostoloff
Poster | Fri 16:30 | Metric Space Magnitude for Evaluating the Diversity of Latent Representations | Katharina Limbeck · Rayna Andreeva · Rik Sarkar · Bastian Rieck
Workshop | Sat 17:27 | Better Bias Benchmarking of Language Models via Multi-factor Analysis | Hannah Powers · Ioana Baldini · Dennis Wei · Kristin P Bennett
Workshop | | Better Bias Benchmarking of Language Models via Multi-factor Analysis | Hannah Powers · Ioana Baldini · Dennis Wei · Kristin P Bennett
Workshop | | Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset | Khaoula Chehbouni · Jonathan Colaço Carr · Yash More · Jackie CK Cheung · Golnoosh Farnadi