Language Models Resist Alignment

Jiaming Ji ⋅ Kaile Wang ⋅ Tianyi (Alex) Qiu ⋅ Boyuan Chen ⋅ Changye Li ⋅ Hantao Lou ⋅ Jiayi Zhou ⋅ Juntao Dai ⋅ Yaodong Yang

Abstract
