Search All 2023 Events
 

Workshop
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
Sam Toyer · Olivia Watkins · Ethan Mendes · Justin Svegliato · Luke Bailey · Tiffany Wang · Isaac Ong · Karim Elmaaroufi · Pieter Abbeel · Trevor Darrell · Alan Ritter · Stuart J Russell
Workshop
Fri 13:45 Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection
Jun Yan · Vikas Yadav · Shiyang Li · Lichang Chen · Zheng Tang · Hai Wang · Vijay Srinivasan · Xiang Ren · Hongxia Jin