Poster in Workshop: Workshop on Machine Learning Safety

Steering Large Language Models using APE

Yongchao Zhou ⋅ Andrei Muresanu ⋅ Ziwen Han ⋅ Keiran Paster ⋅ Silviu Pitis ⋅ Harris Chan ⋅ Jimmy Ba

Abstract

By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model. Because how LLMs work is poorly understood, most effective prompts have been handcrafted by humans through a demanding trial-and-error process. To reduce the human effort involved in this alignment process, we propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection. We treat the instruction as the "program" and optimize it by searching over a pool of instruction candidates proposed by an LLM to maximize a chosen score function. To assess how well the selected instruction steers the model toward the desired behavior, we measure the zero-shot performance of another LLM following that instruction. Experiments on 24 NLP tasks show that our automatically generated instructions outperform the prior LLM baseline by a large margin and achieve better or comparable performance to the instructions generated by human annotators on 19/24 tasks. Moreover, we show that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness. Please check out our webpage at https://sites.google.com/view/automatic-prompt-engineer.
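
The search described in the abstract (score each candidate instruction, keep the best) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the candidate pool is a fixed list standing in for LLM proposals, `toy_model` is a hypothetical stand-in for a real LLM call, and execution accuracy on a few demonstration pairs is just one possible choice of score function.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical demonstration pairs for an antonym task (illustrative only).
DEMOS = [("hot", "cold"), ("up", "down"), ("big", "small")]
ANTONYMS = dict(DEMOS)


def toy_model(instruction: str, x: str) -> str:
    """Stand-in for an LLM call: it only 'follows' instructions that
    mention the word 'opposite'; otherwise it echoes the input."""
    if "opposite" in instruction.lower():
        return ANTONYMS.get(x, x)
    return x


def execution_accuracy(
    instruction: str,
    demos: Sequence[Tuple[str, str]],
    model: Callable[[str, str], str],
) -> float:
    """Assumed score function: fraction of demos the model answers
    correctly when conditioned on the instruction."""
    correct = sum(model(instruction, x) == y for x, y in demos)
    return correct / len(demos)


def select_instruction(
    candidates: Sequence[str],
    score_fn: Callable[[str], float],
) -> Tuple[str, float]:
    """Return the highest-scoring candidate and its score."""
    if not candidates:
        raise ValueError("need at least one candidate instruction")
    scored = [(score_fn(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Fixed pool standing in for instructions proposed by an LLM.
candidates = [
    "Repeat the word.",
    "Write the opposite of the word.",
    "Translate the word to French.",
]

best, score = select_instruction(
    candidates, lambda ins: execution_accuracy(ins, DEMOS, toy_model)
)
print(best, score)
```

In a real setting, the candidate list and `toy_model` would be replaced by calls to actual language models, and the score function could be swapped for any other measure of how well an instruction elicits the desired behavior.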
