

ContextCite: Attributing Model Generation to Context

Benjamin Cohen-Wang · Harshay Shah · Kristian Georgiev · Aleksander Madry

East Exhibit Hall A-C #3407
Wed 11 Dec 11 a.m. PST — 2 p.m. PST


How do language models use information provided as context when generating a response? Can we infer whether a particular generated statement is actually grounded in the context, a misinterpretation, or fabricated? To help answer these questions, we introduce the problem of context attribution: pinpointing the parts of the context (if any) that led a model to generate a particular statement. We then present ContextCite, a simple and scalable method for context attribution that can be applied on top of any existing language model. Finally, we showcase the utility of ContextCite through three applications: (1) helping verify generated statements, (2) improving response quality by pruning the context, and (3) detecting poisoning attacks. We provide code for ContextCite at
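To make the context attribution problem concrete, the following is a minimal sketch of one simple baseline, leave-one-out ablation: remove each context source in turn and measure how much a score for the generated statement drops. This is an illustration of the problem setup only, not the ContextCite method, whose details are not given in this abstract. The names `leave_one_out_attribution` and `toy_score` are hypothetical, and `toy_score` is a word-overlap stand-in for what would in practice be a language model's probability of the statement given the context.

```python
from typing import Callable, List, Sequence


def leave_one_out_attribution(
    sources: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> List[float]:
    """Attribute a statement to context sources by ablating one source at a time.

    The attribution of source i is the drop in `score` when source i is removed.
    """
    full = score(sources)
    attributions = []
    for i in range(len(sources)):
        ablated = [s for j, s in enumerate(sources) if j != i]
        attributions.append(full - score(ablated))
    return attributions


def toy_score(context: Sequence[str]) -> float:
    """Hypothetical stand-in for a model's support for one fixed statement.

    Counts how many key words of the statement appear anywhere in the context.
    A real setup would query a language model instead (assumption).
    """
    statement_words = {"paris", "capital", "france"}
    context_words = {w.strip(".,").lower() for s in context for w in s.split()}
    return float(len(statement_words & context_words))


sources = [
    "Paris is the capital of France.",
    "The Seine flows through the city.",
    "Bread is sold in many bakeries.",
]
scores = leave_one_out_attribution(sources, toy_score)
# Index of the source whose removal hurts the statement's score the most.
top_source = sources[max(range(len(sources)), key=scores.__getitem__)]
```

Here only the first source receives a nonzero attribution, since removing it is the only ablation that changes the toy score. Leave-one-out needs one extra scoring call per source, which is one reason a more scalable approach is of interest for long contexts.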
