

Search All 2024 Events
 

11 Results

Workshop
From Correlation to Causation: Understanding Climate Change through ML and LLM Inquiries
Shan Shan
Workshop
Sat 15:45 Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection
Giorgos Iacovides · Wuyang Zhou · Danilo Mandic
Workshop
Interpretability of LLM Deception: Universal Motif
Wannan Yang · Chen Sun · Gyorgy Buzsaki
Poster
Wed 11:00 LLM Circuit Analyses Are Consistent Across Training and Scale
Curt Tigges · Michael Hanna · Qinan Yu · Stella Biderman
Workshop
Extracting Paragraphs from LLM Token Activations
Nicky Pochinkov · Angelo Benoit · Lovkush Agarwal · Zainab Ali Majid · Lucile Ter-Minassian
Workshop
Sat 15:45 Reexpress: Similarity-Distance-Magnitude Calibration
Allen Schmaltz
Poster
Thu 16:30 Transcoders find interpretable LLM feature circuits
Jacob Dunefsky · Philippe Chlenski · Neel Nanda
Workshop
LoFiT: Localized Fine-tuning on LLM Representations
Fangcong Yin · Xi Ye · Greg Durrett
Workshop
Sat 15:45 Bayesian Concept Bottleneck Models with LLM Priors
Jean Feng · Avni Kothari · Lucas Zier · Chandan Singh · Yan Shuo Tan
Workshop
Uncovering Uncertainty in Transformer Inference
Greyson Brothers · Willa Mannering · John Winder · Amber Tien
Workshop
HarmAnalyst: Interpretable, transparent, and steerable LLM safety moderation
Jing-Jing Li · Valentina Pyatkin · Max Kleiman-Weiner · Liwei Jiang · Nouha Dziri · Anne Collins · Jana Schaich Borg · Maarten Sap · Yejin Choi · Sydney Levine