Datasets and Benchmarks: Dataset and Benchmark Poster Session 3

Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing

Sarah Wiegreffe · Ana Marasovic



Explainable Natural Language Processing (ExNLP) has increasingly focused on collecting human-annotated textual explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as supervision to train models to produce explanations for their predictions, and as ground truth to evaluate model-generated explanations. In this review, we identify 65 datasets with three predominant classes of textual explanations (highlights, free-text, and structured), organize the literature on annotating each type, identify strengths and shortcomings of existing collection methodologies, and give recommendations for collecting ExNLP datasets in the future.
