Workshop
|
|
Interpretable AI in Human-Machine Systems: Insights from Human-in-the-Loop Product Recommendation Engines
Pooria Assadi · Nima Safaei
|
|
Workshop
|
|
Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations
Kola Ayonrinde · Michael Pearce
|
|
Workshop
|
Sat 15:45
|
Reexpress: Similarity-Distance-Magnitude Calibration
Allen Schmaltz
|
|
Workshop
|
Sat 15:45
|
Interactive Semantic Interventions for VLMs: A Human-in-the-Loop Approach to Interpretability
Lukas Klein · Kenza Amara · Carsten Lüth · Hendrik Strobelt · Mennatallah El-Assady · Paul Jaeger
|
|
Workshop
|
Sat 15:45
|
Bayesian Concept Bottleneck Models with LLM Priors
Jean Feng · Avni Kothari · Lucas Zier · Chandan Singh · Yan Shuo Tan
|
|
Workshop
|
|
Decomposing and Editing Predictions by Modeling Model Computation
Harshay Shah · Andrew Ilyas · Aleksander Madry
|
|
Workshop
|
|
NODE-GAMLSS: Interpretable Uncertainty Modelling via Deep Distributional Regression
Ananyapam De · Anton Thielmann · Benjamin Säfken
|
|
Workshop
|
|
Scalable and interpretable quantum natural language processing: an implementation on trapped ions
Tiffany Duneau · Saskia Bruhn · Gabriel Matos · Tuomas Laakkonen · Katerina Saiti · Anna Pearson · Konstantinos Meichanetzidis · Bob Coecke
|
|
Workshop
|
|
Semantic Entropy Neurons: Encoding Semantic Uncertainty in the Latent Space of LLMs
Jiatong Han · Jannik Kossen · Muhammed Razzak · Yarin Gal
|
|
Workshop
|
|
Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations
Kola Ayonrinde · Michael Pearce · Lee Sharkey
|
|
Workshop
|
|
SPRINT Enables Interpretable and Ultra-Fast Virtual Screening against Thousands of Proteomes
Andrew McNutt · Abhinav Adduri · Caleb Ellington · Monica Dayao · Eric Xing · Hosein Mohimani · David Koes
|
|
Workshop
|
|
An Adversarial Perspective on Machine Unlearning for AI Safety
Jakub Łucki · Boyi Wei · Yangsibo Huang · Peter Henderson · Florian Tramèr · Javier Rando
|
|