SkewAct: Red Teaming Large Language Models via Activation-Skewed Adversarial Prompt Optimization

Hanxi Guo ⋅ Siyuan Cheng ⋅ Guanhong Tao ⋅ Guangyu Shen ⋅ Zhuo Zhang ⋅ Shengwei An ⋅ Kaiyuan Zhang ⋅ Xiangyu Zhang

Abstract
