
SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning
Aaron Chan · Jiashu Xu · Boyuan Long · Soumya Sanyal · Tanishq Gupta · Xiang Ren

Thu Dec 09 04:30 PM -- 06:00 PM (PST)

Augmenting pre-trained language models with knowledge graphs (KGs) has achieved success on various commonsense reasoning tasks. However, for a given task instance, the KG, or certain parts of the KG, may not be useful. Although KG-augmented models often use attention to focus on specific KG components, the KG is still always used, and the attention mechanism is never explicitly taught which KG components should be used. Meanwhile, saliency methods can measure how much a KG feature (e.g., graph, node, path) influences the model to make the correct prediction, thus explaining which KG features are useful. This paper explores how saliency explanations can be used to improve KG-augmented models' performance. First, we propose to create coarse (Is the KG useful?) and fine (Which nodes/paths in the KG are useful?) saliency explanations. Second, to motivate saliency-based supervision, we analyze oracle KG-augmented models which directly use saliency explanations as extra inputs for guiding their attention. Third, we propose SalKG, a framework for KG-augmented models to learn from coarse and/or fine saliency explanations. Given saliency explanations created from a task's training set, SalKG jointly trains the model to predict the explanations, then solve the task by attending to KG features highlighted by the predicted explanations. On three popular commonsense QA benchmarks (CSQA, OBQA, CODAH) and a range of KG-augmented models, we show that SalKG can yield considerable performance gains, up to 2.76% absolute improvement on CSQA.
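To make the coarse/fine distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which saliency method is used, so the sketch assumes a confidence comparison for the coarse case and simple occlusion (remove one KG unit, re-score) for the fine case. The names `coarse_saliency`, `fine_saliency`, `toy_score`, and the `threshold` parameter are all hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical scorer type: maps a list of KG units (nodes or paths)
# to a probability distribution over answer choices.
ScoreFn = Callable[[Sequence[str]], List[float]]


def coarse_saliency(p_with_kg: Sequence[float],
                    p_without_kg: Sequence[float],
                    label: int,
                    threshold: float = 0.0) -> int:
    """Assumed coarse explanation: 1 if using the KG raises the probability
    of the correct answer by more than `threshold`, else 0."""
    gain = p_with_kg[label] - p_without_kg[label]
    return int(gain > threshold)


def fine_saliency(score_fn: ScoreFn,
                  units: Sequence[str],
                  label: int) -> List[float]:
    """Assumed fine explanation via occlusion: for each KG unit, the drop in
    the correct answer's probability when that unit is removed."""
    units = list(units)
    full = score_fn(units)[label]
    scores = []
    for i in range(len(units)):
        occluded = units[:i] + units[i + 1:]
        scores.append(full - score_fn(occluded)[label])
    return scores


def toy_score(units: Sequence[str]) -> List[float]:
    """Toy stand-in for a KG-augmented model with two answer choices.
    The node "bird" is arbitrarily treated as helpful for answer 1."""
    p = 0.5 + (0.2 if "bird" in units else 0.0)
    return [1.0 - p, p]


if __name__ == "__main__":
    print(coarse_saliency([0.2, 0.8], [0.5, 0.5], label=1))
    print(fine_saliency(toy_score, ["bird", "rock"], label=1))
```

In this toy setup, a unit with a positive fine score helped the model reach the correct answer, while a score near zero suggests the unit could be ignored. Targets of this kind are what a model would then be trained to predict alongside the task itself.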

Author Information

Aaron Chan (University of Southern California)

CS PhD Student at USC

Jiashu Xu (University of Southern California)

USC student double majoring in Applied Math and Computer Science

Boyuan Long (University of Southern California)
Soumya Sanyal (University of Southern California)
Tanishq Gupta (Indian Institute of Technology Delhi)
Xiang Ren (University of Southern California)
