

Search All 2024 Events
 

84 Results

Workshop
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
Rylan Schaeffer · Dan Valentine · Luke Bailey · James Chua · Zane Durante · Cristobal Eyzaguirre · Joe Benton · Brando Miranda · Henry Sleight · Tony Wang · John Hughes · Rajashree Agrawal · Mrinank Sharma · Scott Emmons · Sanmi Koyejo · Ethan Perez
Workshop
Sun 10:55 Contributed Talk 2: Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
Rylan Schaeffer · Dan Valentine · Luke Bailey · James Chua · Zane Durante · Cristobal Eyzaguirre · Joe Benton · Brando Miranda · Henry Sleight · Tony Wang · John Hughes · Rajashree Agrawal · Mrinank Sharma · Scott Emmons · Sanmi Koyejo · Ethan Perez
Poster
Wed 11:00 Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models
Kaican Li · Weiyan XIE · Yongxiang Huang · Didan Deng · Lanqing Hong · Zhenguo Li · Ricardo Silva · Nevin L. Zhang
Poster
Fri 11:00 Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh · Yifan Hu · Iason Chaimalas · Viraj Mehta · Pier Giuseppe Sessa · Haitham Bou Ammar · Ilija Bogunovic
Poster
Fri 11:00 JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Patrick Chao · Edoardo Debenedetti · Alexander Robey · Maksym Andriushchenko · Francesco Croce · Vikash Sehwag · Edgar Dobriban · Nicolas Flammarion · George J. Pappas · Florian Tramer · Hamed Hassani · Eric Wong
Workshop
Shh, don't say that! Domain Certification in LLMs
Cornelius Emde · Preetham Arvind · Alasdair Paren · Maxime Kayser · Thomas Rainforth · Thomas Lukasiewicz · Philip Torr · Adel Bibi
Workshop
Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini · Adel Javanmard · Murat Erdogdu
Workshop
Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage
Rafi Rashid · Jing Liu · Toshiaki Koike-Akino · Shagufta Mehnaz · Ye Wang
Workshop
Coordinated Robustness Evaluation Framework for Vision Language Models
Ashwin Ramesh Babu · Sajad Mousavi · Desik Rengarajan · Vineet Gundecha · Sahand Ghorbanpour · Avisek Naug · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Soumyendu Sarkar
Workshop
Leveraging Periodicity for Robustness with Multi-modal Mood Pattern Models
Jaya Narain · Qinhua Sun · Oussama Elachqar · Haraldur Hallgrimsson · Feng Zhu · Shirley Ren
Workshop
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
Seanie Lee · Minsu Kim · Lynn Cherif · David Dobre · Juho Lee · Sung Ju Hwang · Kenji Kawaguchi · Gauthier Gidel · Yoshua Bengio · Nikolay Malkin · Moksh Jain
Workshop
Sat 10:45 When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?
Rylan Schaeffer · Dan Valentine · Luke Bailey · James Chua · Cristobal Eyzaguirre · Zane Durante · Joe Benton · Brando Miranda · Henry Sleight · Tony Wang · John Hughes · Rajashree Agrawal · Mrinank Sharma · Scott Emmons · Sanmi Koyejo · Ethan Perez