Poster in Workshop: Workshop on Multi-Turn Interactions in Large Language Models

Improved Multi-Agent Collaboration with Multi-Turn Reinforcement Learning

Shuo Liu ⋅ Tianle Chen ⋅ Christopher Amato

2025 Poster


Abstract

A large amount of work has been done in Multi-Agent Systems (MAS) for modeling and solving problems with multiple interacting agents. However, most LLMs are pretrained independently and not specifically optimized for coordination. Existing LLM fine-tuning frameworks rely on individual rewards, which require complex reward designs for each agent to encourage collaboration. To address these challenges, we model LLM collaboration as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. We develop a multi-agent, multi-turn algorithm, Multi-Agent Group Relative Policy Optimization (MAGRPO), to solve it, building on current RL approaches for LLMs as well as MARL techniques. Our experiments on coding collaboration demonstrate that fine-tuning MAS with MAGRPO enables agents to generate high-quality responses efficiently through effective cooperation. Our approach opens the door to using other MARL methods for LLMs and highlights the associated challenges.
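The abstract does not spell out the MAGRPO update, so the following is only a minimal sketch of the general idea it names: a group-relative advantage, as commonly described for GRPO, computed from a single shared (cooperative) reward rather than per-agent rewards. The function names, the standard-deviation scaling, and the `eps` constant are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(joint_rewards, eps=1e-8):
    """Center and scale rewards within one sampled group.

    joint_rewards: one scalar team reward per sampled joint response
    (hypothetical setup: each sample is one response from every agent).
    """
    mu = mean(joint_rewards)
    sigma = pstdev(joint_rewards)
    # eps avoids division by zero when every sample gets the same reward.
    return [(r - mu) / (sigma + eps) for r in joint_rewards]


def shared_advantages(joint_rewards, num_agents):
    """Give every agent the same advantage for each joint sample.

    Assumption: in a cooperative setting the agents share one reward,
    so each agent's policy update can reuse the same group-relative signal.
    """
    adv = group_relative_advantages(joint_rewards)
    return [list(adv) for _ in range(num_agents)]


if __name__ == "__main__":
    # Four sampled joint responses from two agents, scored by a shared reward.
    rewards = [1.0, 0.0, 1.0, 0.0]
    for agent_id, adv in enumerate(shared_advantages(rewards, num_agents=2)):
        print(agent_id, [round(a, 3) for a in adv])
```

Under these assumptions no per-agent reward design is needed: joint samples that score above the group mean are reinforced for all agents, and those below it are discouraged.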
