NeurIPS Poster $\texttt{ConflictBank}$: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs

Poster

$\texttt{ConflictBank}$ : A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs

Zhaochen Su · Jun Zhang · Xiaoye Qu · Tong Zhu · Yanshu Li · Jiashuo Sun · Juntao Li · Min Zhang · Yu Cheng

East Exhibit Hall A-C #4701

[ Abstract ]

[ Paper]

Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Large language models (LLMs) have achievedimpressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts, a major source of hallucinations, has rarely been studied. While a few research explored the conflicts between the inherent knowledge of LLMs and the retrieved contextual knowledge, a comprehensive assessment of knowledge conflict in LLMs is still missing. Motivated by this research gap, we firstly propose ConflictBank, the largest benchmark with 7.45M claim-evidence pairs and 553k QA pairs, addressing conflicts from misinformation, temporal discrepancies, and semantic divergences.Using ConflictBank, we conduct the thorough and controlled experiments for a comprehensive understanding of LLM behavior in knowledge conflicts, focusing on three key aspects: (i) conflicts encountered in retrieved knowledge, (ii) conflicts within the models' encoded knowledge, and (iii) the interplay between these conflict forms.Our investigation delves into four model families and twelve LLM instances and provides insights into conflict types, model sizes, and the impact at different stages.We believe that knowledge conflicts represent a critical bottleneck to achieving trustworthy artificial intelligence and hope our work will offer valuable guidance for future model training and development.Resources are available at https://github.com/zhaochen0110/conflictbank.

Chat is not available.

Poster

ConflictBankConflictBank\texttt{ConflictBank}: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs

Zhaochen Su · Jun Zhang · Xiaoye Qu · Tong Zhu · Yanshu Li · Jiashuo Sun · Juntao Li · Min Zhang · Yu Cheng

East Exhibit Hall A-C #4701

$\texttt{ConflictBank}$ : A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs