Do Language Models Robustly Acquire New Knowledge?
Harshay Shah · Badih Ghazi · Yangsibo Huang · Ravi Kumar · Da Yu · Chiyuan Zhang
Abstract
Language models acquire vast knowledge during pretraining, but adding new knowledge to pretrained models often lacks robustness: models can retrieve individual facts yet struggle with multi-hop reasoning over newly acquired knowledge and its implications. To systematically study this robustness gap, we introduce RANK (Robust Acquisition of New Knowledge), a testbed that uses synthetic knowledge graphs to evaluate knowledge acquisition via $k$-hop reasoning tasks of increasing complexity. Evaluating supervised fine-tuning (SFT) and in-context learning (ICL) with RANK, we find that ICL performance degrades with reasoning complexity and knowledge scale, while models fine-tuned on simple facts fail entirely at multi-hop reasoning. However, increasing training data diversity induces a sharp phase transition in fine-tuned models, from memorization to out-of-distribution generalization. More generally, RANK enables controlled experiments that yield insights into the robustness of knowledge acquisition.
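To make the setup concrete, below is a minimal sketch of how a synthetic knowledge graph and $k$-hop reasoning queries of the kind the abstract describes could be generated. The function names, relation vocabulary, and query rendering are illustrative assumptions, not the paper's actual RANK implementation: the paper does not specify these details here.

```python
import random

def build_synthetic_kg(num_entities=50, relations=("mother", "employer", "capital_of"), seed=0):
    """Toy knowledge graph: each (entity, relation) pair maps to one random target entity.
    (Hypothetical construction; RANK's actual graph generation may differ.)"""
    rng = random.Random(seed)
    entities = [f"ent_{i}" for i in range(num_entities)]
    kg = {(e, r): rng.choice(entities) for e in entities for r in relations}
    return entities, kg

def sample_k_hop_query(kg, entities, relations, k, rng):
    """Chain k relations from a random head entity; return a rendered question and its answer."""
    head = rng.choice(entities)
    cur, question = head, head
    for _ in range(k):
        r = rng.choice(relations)
        cur = kg[(cur, r)]
        question = f"the {r} of {question}"  # e.g. "the employer of the mother of ent_3"
    return question, cur

def make_eval_set(kg, entities, relations, k, n, seed=1):
    rng = random.Random(seed)
    return [sample_k_hop_query(kg, entities, list(relations), k, rng) for _ in range(n)]

if __name__ == "__main__":
    relations = ("mother", "employer", "capital_of")
    entities, kg = build_synthetic_kg(relations=relations)
    # 1-hop facts could serve as SFT training data; k-hop queries probe compositional use of them.
    for q, a in make_eval_set(kg, entities, relations, k=3, n=3):
        print(f"Q: What is {q}?  A: {a}")
```

Under this kind of setup, 1-hop facts would form the injected knowledge (via SFT or an ICL context), while increasing $k$ scales the reasoning complexity used to probe how robustly that knowledge was acquired.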