Search All 2024 Events
 

7 Results

Workshop
Can sparse autoencoders be used to decompose and interpret steering vectors?
Harry Mayne · Yushi Yang · Adam Mahdi
Workshop
Overcoming Limitations of Steering Vectors with Low-Rank Representation Steering
Dmitrii Krasheninnikov · David Krueger
Workshop
Can sparse autoencoders be used to decompose and interpret steering vectors?
Harry Mayne · Yushi Yang · Adam Mahdi
Workshop
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus · Steven Abreu
Poster
Thu 16:30 Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao · Tianrong Zhang · Bochuan Cao · Ziyi Yin · Lu Lin · Fenglong Ma · Jinghui Chen
Poster
Thu 11:00 Analysing the Generalisation and Reliability of Steering Vectors
Daniel Tan · David Chanin · Aengus Lynch · Brooks Paige · Dimitrios Kanoulas · Adrià Garriga-Alonso · Robert Kirk
Workshop
Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks
Madeline Brumley · Joe Kwon · David Krueger · Dmitrii Krasheninnikov · Usman Anwar