Active Inference Control: Steering, Not Just Scaling, Language Model Reasoning
Abstract
Large Language Models (LLMs) excel at multi-step reasoning but are hindered by suboptimal allocation of their computational budget. Recent work, such as s1, has shown that increasing the token budget can improve performance, but it relies on a static, pre-defined budget that is inefficient and fails to adapt to the dynamic nature of the reasoning process. In this work, we argue that the paradigm of static budget allocation is fundamentally limited. We introduce the Active Inference Controller (AIC), a closed-loop control system that dynamically steers the LLM's reasoning process in real time. At each step, the AIC assesses the semantic trajectory of the model's thought process and makes one of three decisions: continue generation, terminate on a high-confidence solution, or intervene to correct a failing path. We train a lightweight XGBoost classifier as our AIC, using the s1 model's own internal embeddings as a high-fidelity performance signal. In a comparative analysis across the GPQA Diamond, GSM8K, and OpenBookQA benchmarks, our AIC-steered system significantly outperforms a strong s1 baseline, achieving higher accuracy while allocating compute more intelligently. Our work demonstrates the viability of a new paradigm for inference: active, intelligent process control rather than static, brute-force scaling.
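The three-way control loop described above can be sketched in miniature. This is an illustrative skeleton only, not the paper's implementation: the threshold rules stand in for the trained XGBoost classifier, and the feature names (`confidence`, `drift`) and the helper `steer` are hypothetical placeholders for features derived from the model's internal embeddings.

```python
from enum import Enum

class Action(Enum):
    CONTINUE = 0   # keep generating the current reasoning path
    TERMINATE = 1  # stop early on a high-confidence solution
    INTERVENE = 2  # inject a correction onto a failing path

def aic_step(confidence: float, drift: float) -> Action:
    """Hypothetical stand-in for the trained classifier: map per-step
    features (assumed derived from internal embeddings) to an action."""
    if confidence > 0.9:
        return Action.TERMINATE
    if drift > 0.5:
        return Action.INTERVENE
    return Action.CONTINUE

def steer(model_step, max_steps: int = 32):
    """Closed-loop sketch: query the controller after each reasoning step.
    `model_step` is a hypothetical callable yielding (confidence, drift, text)."""
    trace = []
    for _ in range(max_steps):
        confidence, drift, text = model_step()
        action = aic_step(confidence, drift)
        trace.append((text, action))
        if action is Action.TERMINATE:
            break
        # On INTERVENE, a corrective prompt would be appended here
        # before the next generation step.
    return trace
```

The key design point is that the budget is no longer fixed in advance: generation ends whenever the controller's termination condition fires, so compute tracks the difficulty of the instance rather than a static cap.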