

Poster

RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models

Shuhao Chen · Weisen Jiang · Baijiong Lin · James Kwok · Yu Zhang

West Ballroom A-D #6900
Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. Routing, which learns a router to select the most suitable LLM for each query, is a promising way to achieve this. However, existing routing models are ineffective when multiple LLMs perform well on the same query. To address this problem, we propose the query-based Router by Dual Contrastive learning (RouterDC). The RouterDC model consists of an encoder and learnable LLM embeddings, and is trained with two proposed contrastive losses (a sample-LLM loss and a sample-sample loss). Experimental results show that RouterDC is effective in assembling LLMs and substantially outperforms both the individual top-performing LLMs and existing routing methods on in-distribution (+2.76%) and out-of-distribution (+1.90%) tasks. The source code is available at https://github.com/shuhao02/RouterDC.
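To make the architecture concrete, below is a minimal, hypothetical sketch of a RouterDC-style router: an encoder maps each query to a vector, which is scored against one learnable embedding per candidate LLM, and the two contrastive losses shape these embeddings. The InfoNCE-style loss forms, the cosine-similarity scoring, the temperature, and the names (`Router`, `sample_llm_loss`, `sample_sample_loss`, the positive-mask and group-id labels) are all illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
# Hypothetical sketch of a RouterDC-style router (not the authors' code).
# Assumptions: InfoNCE-style contrastive losses, cosine-similarity scoring,
# and precomputed query embeddings from some text encoder.
import torch
import torch.nn.functional as F


class Router(torch.nn.Module):
    def __init__(self, encoder_dim: int, num_llms: int):
        super().__init__()
        # One learnable embedding per candidate LLM.
        self.llm_embeddings = torch.nn.Parameter(torch.randn(num_llms, encoder_dim))

    def scores(self, query_emb: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between encoded queries and each LLM embedding.
        q = F.normalize(query_emb, dim=-1)
        k = F.normalize(self.llm_embeddings, dim=-1)
        return q @ k.t()  # shape: (batch, num_llms)


def sample_llm_loss(scores, pos_mask, temperature=0.07):
    """Pull each query toward the LLMs that answered it well (positives)
    and away from the rest, averaging over positives (InfoNCE-style)."""
    logits = scores / temperature
    log_prob = logits - torch.logsumexp(logits, dim=-1, keepdim=True)
    pos_log_prob = (log_prob * pos_mask).sum(-1) / pos_mask.sum(-1).clamp(min=1)
    return -pos_log_prob.mean()


def sample_sample_loss(query_embs, group_ids, temperature=0.07):
    """Pull queries from the same group (e.g., same task cluster) together
    so that similar queries receive similar routing decisions."""
    q = F.normalize(query_embs, dim=-1)
    sim = q @ q.t() / temperature
    n = sim.size(0)
    self_mask = torch.eye(n, dtype=torch.bool)
    pos = (group_ids.unsqueeze(0) == group_ids.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=-1, keepdim=True)
    pos_log_prob = (log_prob * pos).sum(-1) / pos.sum(-1).clamp(min=1)
    return -pos_log_prob.mean()


# Usage sketch with placeholder data; at inference time, each query is
# routed to the LLM with the highest score.
router = Router(encoder_dim=384, num_llms=5)
query_embs = torch.randn(8, 384)               # placeholder encoder outputs
pos_mask = (torch.rand(8, 5) > 0.5).float()    # placeholder "answered well" labels
group_ids = torch.randint(0, 3, (8,))          # placeholder query clusters
loss = sample_llm_loss(router.scores(query_embs), pos_mask) \
     + sample_sample_loss(query_embs, group_ids)
best_llm = router.scores(query_embs).argmax(dim=-1)  # routing decision
```

Under these assumptions, the sample-LLM loss alone could still scatter similar queries across the embedding space; the sample-sample term is what encourages consistent routing for near-duplicate queries, which matches the paper's motivation for using two losses.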
