

Poster in Workshop: Third Workshop on Efficient Natural Language and Speech Processing (ENLSP-III): Towards the Future of Large Language Models and their Emerging Descendants

Lightweight Retrieval Tuning for Black-Box Language Models

Xiao-Wen Yang · Hong-Jie You · Pengxiao Song · Hao-Ran Hao · Jie-Jing Shao · Yu-Feng Li


Abstract:

Retrieval-augmented language models have demonstrated remarkable effectiveness, particularly on knowledge-intensive tasks. Previous work on retrieval augmentation typically requires tuning the parameters of the language model or updating the vector datastore, incurring substantial computational cost. This becomes infeasible as language models and vector datastores continue to grow, especially when the language model is accessible only through an API. We therefore treat the language model as a black box and keep the vector datastore frozen. We propose a lightweight retrieval tuning technique that introduces a self-adapted similarity matching module with fewer than 1M parameters. Because the black-box language model cannot be trained end-to-end, Proximal Policy Optimization (PPO) is used to fine-tune the introduced parameters. Our approach scales well: it can be applied regardless of which frozen vector datastore or black-box language model is used. It is also efficient to train, with the speed bottleneck lying in the inference of the black-box language model. Experiments on the MMLU and TriviaQA benchmarks demonstrate that our lightweight retrieval tuning significantly improves retrieval augmentation across language models of different scales and architectures. In particular, our method improves InstructGPT's performance on MMLU by 6%.
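
The abstract describes a small similarity matching module trained with PPO, where the black-box language model only supplies a scalar reward and no gradients. The sketch below is one possible instantiation of that setup, not the authors' implementation: the module architecture, embedding sizes, candidate count, and the reward function `query_black_box_lm` are hypothetical placeholders introduced for illustration.

```python
"""Minimal sketch of lightweight retrieval tuning for a black-box LM.

Assumptions (HYPOTHETICAL, not from the paper): the scorer is a single
shared linear projection, embeddings come from a frozen retriever, and
the reward is a 0/1 correctness signal from an API-only LM.
"""
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimilarityMatcher(nn.Module):
    """Trainable scorer between a query embedding and candidate passage
    embeddings. With a 768 -> 256 projection it has ~0.2M parameters,
    comfortably under the 1M budget mentioned in the abstract."""

    def __init__(self, emb_dim: int = 768, hidden: int = 256):
        super().__init__()
        self.proj = nn.Linear(emb_dim, hidden, bias=False)  # shared projection

    def forward(self, query_emb: torch.Tensor, passage_embs: torch.Tensor) -> torch.Tensor:
        q = self.proj(query_emb)        # (hidden,)
        p = self.proj(passage_embs)     # (num_candidates, hidden)
        return p @ q                    # unnormalized scores over candidates


def query_black_box_lm(question: str, passage_id: int) -> float:
    """HYPOTHETICAL reward: call the API-only LM with the chosen passage
    prepended and return 1.0 if the answer is judged correct, else 0.0."""
    return float(torch.rand(()) > 0.5)  # placeholder for the real API call


def ppo_step(matcher, optimizer, query_emb, passage_embs, question,
             clip_eps: float = 0.2, n_epochs: int = 4):
    """One PPO update of the retrieval policy (softmax over candidate scores).
    The black-box LM only provides a scalar reward, so gradients never flow
    through it; only the matcher's parameters are updated."""
    with torch.no_grad():
        old_logits = matcher(query_emb, passage_embs)
        old_logp = F.log_softmax(old_logits, dim=-1)
        action = torch.distributions.Categorical(logits=old_logits).sample()
        reward = query_black_box_lm(question, action.item())
        advantage = reward  # no baseline for simplicity; a value baseline would reduce variance

    for _ in range(n_epochs):
        logits = matcher(query_emb, passage_embs)
        logp = F.log_softmax(logits, dim=-1)[action]
        ratio = torch.exp(logp - old_logp[action])
        unclipped = ratio * advantage
        clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantage
        loss = -torch.min(unclipped, clipped)  # PPO clipped surrogate objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    matcher = SimilarityMatcher()
    optimizer = torch.optim.Adam(matcher.parameters(), lr=1e-4)
    query_emb = torch.randn(768)         # from the frozen retriever
    passage_embs = torch.randn(16, 768)  # top-16 candidates from the frozen datastore
    ppo_step(matcher, optimizer, query_emb, passage_embs, "Who wrote Hamlet?")
```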
