Test-Time Scaling for Multistep Reasoning in Small Language Models via A* Search
Abstract
Large language models (LLMs) have demonstrated strong abilities across a wide range of tasks but are costly in computation and memory. Small language models (SLMs), in contrast, offer significant advantages in efficiency and deployability but typically struggle with complex mathematical reasoning. To address this gap, we present Test-Time A* Search (TTA*), a test-time scaling framework that casts reasoning as goal-directed search over a tree of partial solutions. TTA* is training-free and requires no external supervision or multi-model architecture, making it practical in resource-constrained settings. Acting as a drop-in decoding wrapper for SLMs, TTA* systematically explores, critiques, and refines candidate solution paths using the model's own self-reflection capability. Extensive experiments on popular mathematical reasoning benchmarks with a variety of base models show that TTA* consistently improves accuracy and robustness, indicating broad applicability to general mathematical reasoning tasks.
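To make the abstract's framing concrete, the sketch below illustrates what A*-style best-first search over a tree of partial solutions could look like in code. It is a minimal sketch under stated assumptions, not the paper's implementation: the callbacks `expand` (propose candidate next reasoning steps), `score` (a self-evaluation playing the role of A*'s f = g + h, lower is better), and `is_goal` (detect a complete solution), along with `beam_width` and `max_expansions`, are hypothetical names introduced here for illustration.

```python
import heapq
import itertools


def test_time_astar(expand, score, is_goal, root,
                    max_expansions=100, beam_width=4):
    """Best-first (A*-style) search over a tree of partial solutions.

    expand(state)  -> list of child states (candidate next reasoning steps)
    score(state)   -> estimated total cost f = g + h; lower is better
    is_goal(state) -> True if the state is a complete solution
    """
    counter = itertools.count()  # tie-breaker so the heap never compares states
    frontier = [(score(root), next(counter), root)]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, state = heapq.heappop(frontier)  # expand the most promising node
        if is_goal(state):
            return state
        for child in expand(state)[:beam_width]:  # cap branching per node
            heapq.heappush(frontier, (score(child), next(counter), child))
    return None  # search budget exhausted without a complete solution
```

In such a setup, the priority function is where the SLM's self-reflection would enter: the model critiques each partial path and the resulting score determines which branch the search explores or refines next.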