In-person presentation in Workshop: Attributing Model Behavior at Scale (ATTRIB)

Data attribution for LMMs and beyond (James Zou)


Abstract:

I will discuss DataInf, an efficient influence approximation method that is practical for large-scale generative AI models. Our theoretical analysis shows that DataInf is particularly well-suited for parameter-efficient fine-tuning techniques such as LoRA. In applications to generative models such as Llama-2 and stable-diffusion, DataInf effectively identifies the most influential fine-tuning examples and is substantially faster than previous methods. Moreover, it can help to identify which data points are mislabeled.
