Workshop
|
|
Can sparse autoencoders be used to decompose and interpret steering vectors?
Harry Mayne · Yushi Yang · Adam Mahdi
|
|
Workshop
|
|
Overcoming Limitations of Steering Vectors with Low-Rank Representation Steering
Dmitrii Krasheninnikov · David Krueger
|
|
Workshop
|
|
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus · Steven Abreu
|
|
Poster
|
Thu 16:30
|
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao · Tianrong Zhang · Bochuan Cao · Ziyi Yin · Lu Lin · Fenglong Ma · Jinghui Chen
|
|
Poster
|
Thu 11:00
|
Analysing the Generalisation and Reliability of Steering Vectors
Daniel Tan · David Chanin · Aengus Lynch · Brooks Paige · Dimitrios Kanoulas · Adrià Garriga-Alonso · Robert Kirk
|
|
Workshop
|
|
Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks
Madeline Brumley · Joe Kwon · David Krueger · Dmitrii Krasheninnikov · Usman Anwar
|
|