PREMISE: Scalable and Strategic Prompt Optimization for Efficient Mathematical Reasoning in Large Models
Abstract
Large Reasoning Models (LRMs) such as Claude 3.7 Sonnet and OpenAI o1 achieve strong performance on mathematical tasks via long Chain-of-Thought (CoT) reasoning, but they often generate unnecessarily verbose reasoning traces. This inflates token usage and cost, limiting deployment in latency-sensitive or API-constrained settings. To address this issue, we present \textbf{PREMISE} (\textit{PRompt-based Efficient Mathematical Inference with Strategic Evaluation}), an optimization framework designed specifically for black-box commercial LRMs. PREMISE reduces reasoning overhead without modifying model weights or issuing multiple queries per problem. It combines trace-level diagnostics with gradient-based prompt optimization to minimize redundant computation while preserving answer accuracy. Across GSM8K, SVAMP, and Math500, PREMISE matches or exceeds baseline accuracy while reducing reasoning tokens by up to \textbf{87.5\%} and cutting dollar cost by \textbf{69--82\%}.
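To make the objective concrete, the following is a minimal sketch, not the paper's implementation, of prompt selection for a black-box model under the stated goal: preserve accuracy first, then minimize reasoning tokens. All names (`toy_model`, `select_prompt`, the candidate prompts) are illustrative assumptions, and the whitespace tokenizer stands in for a provider tokenizer.

```python
def count_tokens(text):
    # Crude whitespace tokenizer standing in for the provider's tokenizer.
    return len(text.split())

def evaluate(prompt, model, dataset):
    """Query the black-box model once per example; return (accuracy, avg trace tokens)."""
    correct, tokens = 0, 0
    for question, answer in dataset:
        trace, prediction = model(prompt, question)
        correct += (prediction == answer)
        tokens += count_tokens(trace)
    n = len(dataset)
    return correct / n, tokens / n

def select_prompt(candidates, model, dataset):
    """Prefer higher accuracy; break ties by fewer reasoning tokens."""
    scored = [(evaluate(p, model, dataset), p) for p in candidates]
    # Sort key (-accuracy, avg_tokens): accuracy is preserved, verbosity minimized.
    scored.sort(key=lambda item: (-item[0][0], item[0][1]))
    return scored[0][1]

if __name__ == "__main__":
    # Toy stand-in for a commercial LRM: a concise prompt yields a shorter trace.
    def toy_model(prompt, question):
        trace = "step " * (3 if "concise" in prompt else 20)
        return trace, eval(question)  # toy arithmetic answers only

    data = [("2+2", 4), ("3*3", 9)]
    best = select_prompt(
        ["Think step by step.", "Be concise; think step by step."],
        toy_model, data)
    print(best)
```

The accuracy-then-tokens ordering mirrors the abstract's constraint that token reduction must not trade away correctness; a real system would replace the exhaustive scoring loop with the paper's gradient-based prompt optimization.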