InfraGym: Empowering LLM Agents for Real-World Computer System Optimization
Abstract
Large language model (LLM) agents have shown strong potential for improving the performance of complex computer systems in tasks such as cluster scheduling, network congestion control, and adaptive video streaming. However, without a standard, safe, and extensible benchmarking platform, it is difficult to evaluate whether these agents improve real-world system performance, and by how much. We present InfraGym, an open, extensible platform on which researchers can study computer system optimization with LLM agents. The current release includes three real-world cases and supports interaction with both simulated and real environments. We benchmark multiple LLM agents on these tasks using both open-source and closed-source LLMs, and outline future directions. The code is available at https://github.com/MLSysOps/InfraGym.