

Poster in Workshop: Multi-Agent Security: Security as Key to AI Safety

Assessing Risks of Using Autonomous Language Models in Military and Diplomatic Planning

Gabe Mukobi · Ann-Katrin Reuel · Juan-Pablo Rivera · Chandler Smith

Keywords: [ AI ] [ military ] [ high-stakes ] [ large language models ] [ artificial intelligence ] [ multi-agent ] [ foundation models ] [ decision-making ]

Presentation: Multi-Agent Security: Security as Key to AI Safety
Sat 16 Dec 7 a.m. PST — 3:30 p.m. PST

Abstract:

The potential integration of autonomous agents into high-stakes military and foreign-policy decision-making has gained prominence, especially with the emergence of advanced generative AI models like GPT-4. This paper scrutinizes the behavior of multiple autonomous agents in simulated military and diplomacy scenarios, focusing specifically on their potential to escalate conflicts. Drawing on established international relations frameworks, we assessed the escalation potential of the decisions these agents made across different scenarios. In contrast to prior qualitative studies, our research provides both qualitative and quantitative insights. We find significant differences in the models' predilection to escalate: Claude 2 is the least aggressive and GPT-4-Base the most aggressive of the models we evaluated. Our findings indicate that, even in seemingly neutral contexts, language-model-based autonomous agents occasionally opt for aggressive or provocative actions, and this tendency intensifies in scenarios with predefined trigger events. Importantly, the patterns behind such escalatory behavior remain largely unpredictable. Furthermore, a qualitative analysis of the models' verbalized reasoning, particularly for the GPT-4-Base model, reveals concerning justifications. Given the high stakes involved in military and foreign-policy contexts, the deployment of such autonomous agents demands further examination and cautious consideration.
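
The multi-agent setup described in the abstract can be pictured, very roughly, as a turn-based loop in which each language-model agent chooses an action for its nation and each chosen action is mapped to a numeric escalation score. The sketch below is a hypothetical illustration of that idea only, not the authors' framework: the function query_model, the ESCALATION_SCORES table, and the scenario text are all invented placeholders, and the model call is stubbed out with random sampling so the example runs on its own.

```python
# Illustrative sketch only. All names (query_model, ESCALATION_SCORES, SCENARIO)
# are hypothetical placeholders, not the prompts or scoring scheme from the paper.
import random
from typing import Dict, List

# Hypothetical mapping from action categories to escalation scores,
# loosely in the spirit of grading actions on an escalation ladder.
ESCALATION_SCORES: Dict[str, int] = {
    "de-escalate": -2,
    "negotiate": -1,
    "wait": 0,
    "posture": 1,
    "blockade": 3,
    "strike": 5,
}

SCENARIO = "Two neighboring states dispute a border region after a trigger event."


def query_model(model_name: str, prompt: str) -> str:
    """Placeholder for a call to a chat-model API.

    Here it just samples an action at random so the sketch runs end to end;
    a real experiment would send the prompt to the named model instead.
    """
    return random.choice(list(ESCALATION_SCORES))


def run_simulation(models: List[str], turns: int = 5) -> Dict[str, List[int]]:
    """Run a toy multi-agent loop and record each agent's escalation score per turn."""
    history: Dict[str, List[int]] = {m: [] for m in models}
    for turn in range(turns):
        for model in models:
            prompt = (
                f"{SCENARIO}\nTurn {turn}. You act for one state. "
                f"Choose one action from {sorted(ESCALATION_SCORES)}."
            )
            action = query_model(model, prompt)
            history[model].append(ESCALATION_SCORES[action])
    return history


if __name__ == "__main__":
    scores = run_simulation(["model-a", "model-b"])
    for model, per_turn in scores.items():
        print(model, "mean escalation:", sum(per_turn) / len(per_turn))
```

Comparing the per-model mean (or distribution) of such scores across scenarios is one simple way the quantitative escalation comparison described above could be operationalized; the paper's actual scoring framework and prompts may differ.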
