Poster

Can Models Learn Skill Composition from Examples?

Haoyu Zhao · Simran Kaur · Dingli Yu · Anirudh Goyal · Sanjeev Arora

West Ballroom A-D #6908
Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract: As large language models (LLMs) become increasingly capable, their ability to exhibit *compositional generalization of skills*—i.e., utilizing combinations of skills in ways not encountered during training—has garnered significant attention. Such generalization, especially beyond training scenarios, is also of interest in the study of AI safety and alignment. A recent study introduced the Skill-Mix evaluation, where models are tasked with composing a short paragraph demonstrating the use of a specified $k$-tuple of language skills. While small models struggled with composing even with $k=3$, larger models like GPT-4 showed reasonable performance with $k=5$ and $6$. In this paper, we employ a setup akin to Skill-Mix to evaluate the capacity of smaller models to learn compositional generalization from examples. Utilizing a diverse set of language skills—including rhetorical, literary, reasoning, theory of mind, and common sense—we used GPT to generate text samples that exhibit random subsets of $k$ skills. Subsequent fine-tuning of 7B and 13B parameter models on these combined-skill texts, for increasing values of $k$, revealed the following findings: (1) Training on combinations of $k=2$ and $3$ skills results in noticeable improvements in the ability to compose texts with $k=4$ and $5$ skills, despite models never having seen such examples during training. (2) When skill categories are split into training and held-out groups, models significantly improve at composing texts with held-out skills during testing, despite having only seen training skills during fine-tuning, illustrating the efficacy of the training approach even with previously unseen skills. This study also suggests that incorporating skill-rich (potentially synthetic) text into training can substantially enhance the compositional capabilities of models.
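To illustrate the data-generation recipe described in the abstract, here is a minimal sketch of sampling a random $k$-tuple of skills and assembling a Skill-Mix-style generation prompt. The skill pool and prompt template below are hypothetical placeholders, not the authors' actual lists or prompts:

```python
import random

# Hypothetical skill pool grouped by category (illustrative only;
# the paper's actual skill set is larger and curated differently).
SKILLS = {
    "rhetorical": ["metaphor", "hyperbole"],
    "literary": ["foreshadowing", "irony"],
    "reasoning": ["modus ponens", "proof by contradiction"],
    "theory of mind": ["false belief", "perspective taking"],
    "common sense": ["object permanence", "social norms"],
}

def sample_skill_tuple(k, rng=random):
    """Draw a random k-subset of skills from the flattened pool."""
    pool = [skill for group in SKILLS.values() for skill in group]
    return rng.sample(pool, k)

def build_prompt(skills, topic):
    """Assemble a generation prompt (template is an assumption, not the paper's)."""
    skill_list = ", ".join(skills)
    return (
        f"Write a short paragraph about {topic} that naturally "
        f"demonstrates all of these skills: {skill_list}."
    )

# Example: sample a 3-tuple of skills and build one prompt.
skills = sample_skill_tuple(3)
print(build_prompt(skills, "gardening"))
```

Texts generated from such prompts (for small $k$) would form the fine-tuning set, while evaluation uses larger $k$ or held-out skill categories.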