Search All 2024 Events
 

17 Results

Workshop
TRIAGE: Ethical Benchmarking of AI Models Through Mass Casualty Simulations
Nathalie Kirch · Konstantin Hebenstreit · Matthias Samwald
Workshop
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
Giulio Zizzo · Giandomenico Cornacchia · Kieran Fraser · Muhammad Zaid Hameed · Ambrish Rawat · Beat Buesser · Mark Purcell · Pin-Yu Chen · Prasanna Sattigeri · Kush Varshney
Workshop
Decoding Biases: An Analysis of Automated Methods and Metrics for Gender Bias Detection in Language Models
Shachi H. Kumar · Saurav Sahay · Sahisnu Mazumder · Eda Okur · Ramesh Manuvinakurike · Nicole Beckage · Hsuan Su · Hung-yi Lee · Lama Nachman
Workshop
Lexically-constrained automated prompt augmentation: A case study using adversarial T2I data
Jessica Quaye · Alicia Parrish · Oana Inel · Minsuk Kahng · Charvi Rastogi · Hannah Rose Kirk · Jess Tsang · Nathan Clement · Rafael Mosquera-Gomez · Juan Ciro · Vijay Janapa Reddi · Lora Aroyo
Poster
Wed 11:00 Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Mikayel Samvelyan · Sharath Chandra Raparthy · Andrei Lupu · Eric Hambro · Aram Markosyan · Manish Bhatt · Yuning Mao · Minqi Jiang · Jack Parker-Holder · Jakob Foerster · Tim Rocktäschel · Roberta Raileanu