Workshop
Sat 12:00 ICScore: Metrics for Evaluating Interestingness and Creativity of Stories
Junha Lee · Jaeshin Cho · Youngjin Cho · Hyewon Jin · Hyemin Lee · Min Song
Workshop
Multilingual Hallucination Gaps in Large Language Models
Cléa Chataigner · Afaf Taik · Golnoosh Farnadi
Workshop
Sat 15:45 Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks
Rachel Longjohn · Giri Gopalan · Emily Casleton
Poster
Fri 16:30 Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
Michael Saxon · Fatima Jahara · Mahsa Khoshnoodi · Yujie Lu · Aditya Sharma · William Yang Wang
Workshop
Sat 9:00 Algorithmic Fairness through the lens of Metrics and Evaluation
Awa Dieng · Miriam Rateike · Jamelle Watson-Daniels · Golnoosh Farnadi · Nando Fioretto
Workshop
Sat 17:27 Benchmark to Audit LLM Generated Clinical Notes for Disparities Arising from Biases and Stereotypes
Hongyu Cai · Swetasudha Panda · Naveen Jafer Nizar · Qinlan Shen · Daeja Oxendine · Sumana Srivatsa · Krishnaram Kenthapadi
Workshop
Evaluating Gender Bias Transfer between Pre-trained and Prompt Adapted Language Models
Nivedha Sivakumar · Natalie Mackraz · Samira Khorshidi · Krishna Patel · Barry-John Theobald · Luca Zappella · Nicholas Apostoloff
Poster
Fri 16:30 Metric Space Magnitude for Evaluating the Diversity of Latent Representations
Katharina Limbeck · Rayna Andreeva · Rik Sarkar · Bastian Rieck
Workshop
Sat 17:27 Better Bias Benchmarking of Language Models via Multi-factor Analysis
Hannah Powers · Ioana Baldini · Dennis Wei · Kristin P Bennett
Workshop
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni · Jonathan Colaço Carr · Yash More · Jackie CK Cheung · Golnoosh Farnadi